Introduction


Background 

Prostate  cancer  poses  a  major  public  health  problem  in  the  United  States  and  worldwide.  It  has  the  highest 
incidence  and  is  the  second  most  common  cause  of  cancer  deaths  in  North  American  men  resulting  in  over 
30,000  deaths  per  annum.  Consequently,  there  is  an  urgent 
need  to  develop  novel  therapeutic  approaches.  The  molecular 
mechanisms  of  development  and  progression  of  prostate 
cancer  are  complicated  and  likely  to  involve  multiple  factors. 

The human double minute 2 (Hdm2) protein is amplified or
overexpressed in a number of human tumors, including
prostate cancer. Importantly, the Hdm2 antagonist nutlin-3,
which is particularly effective in causing p53-dependent
apoptosis in Hdm2-amplified cultured cells, exhibits antitumor
activity against human prostate LNCaP and other xenografts in
nude mice. Hdm2 promotes p53 degradation through a
ubiquitin-dependent pathway (Fig. 1). The exact mechanism by
which p53 is stabilized is unclear, although a series of
post-translational modifications to p53 itself, Hdm2 and the closely
related protein Hdmx (also known as Hdm4) are thought to
dissociate the p53-Hdm2 complex, leading to increased levels
of p53. Although Hdmx contains a RING domain that is very
similar to the RING domain of Hdm2, it does not possess
intrinsic E3 ubiquitin ligase activity. Dimerization, mediated by
the conserved C-terminal RING domains of both Hdm2 and
Hdmx, appears to greatly augment the E3 ligase activity of Hdm2.
While the Hdm RING domains can form homodimers, heterodimers form
preferentially, resulting in reduced auto-ubiquitylation of Hdm2
and increased p53 ubiquitylation. Thus, disruption of this
interaction should inactivate Hdm2 E3 ligase activity and
consequently increase p53 abundance. The recent elucidation
of the structure of the complex formed by the RING domains of
Hdm2 and Hdmx suggests the feasibility of obtaining Hdm-specific
E3 ligase inhibitors by targeting the Hdm2/Hdmx RING
domain dimer interface rather than the primary E2 binding site
that is common to many RING domain E3 ubiquitin ligases (Fig. 2).

Objectives 

Disruption of Hdm2 function is a novel and attractive therapeutic strategy for prostate cancer
[2, 3] (Fig. 1). As mentioned earlier, nutlin-3, a peptidomimetic that disrupts the p53-Hdm2
interaction, activates p53 pathways both in vitro and in vivo in human cell lines that
possess wild-type p53 and overexpress Hdm2 [4]. Vousden and colleagues have also
established that it is possible to stabilize p53 by directly inhibiting the E3 ligase activity of
Hdm2 [6], although this approach may also inhibit other E3s. The recent elucidation of the
structure of the Hdm2-Hdmx RING domain heterodimer shows that both protein domains
contribute residues for E3 ligase activity (Fig. 3). This strongly suggests that it might be
possible to obtain Hdm-specific E3 ligase inhibitors by targeting the Hdm2-Hdmx RING domain
dimer interface rather than the primary E2 binding site that is common to many RING domain
E3 ubiquitin ligases.

Figure 2. Hdm2-Hdmx heterodimers are more effective p53 ubiquitin ligases than Hdm2
homodimers, which can also ubiquitylate themselves.


To achieve this objective we are using cell-based libraries of cyclotides to select specific cyclotide
sequences able to antagonize the RING-mediated interaction between Hdm2 and Hdmx, and thereby obtain
Hdm-specific E3 ligase inhibitors. The modified protein splicing technology developed in the Camarero lab
allows the generation of large, genetically-encoded cyclotide libraries in bacterial cells. These cell-based
libraries are then screened using an in-cell FRET-based reporter in combination with high-throughput flow
cytometry to identify bacteria encoding cyclotides able to disrupt Hdm2-Hdmx interactions. Selected cyclotides
are then characterized by NMR and assayed in mammalian cells using the BiLC assay implemented in p53 wild-type
and p53-null cancer cell lines to ascertain p53-dependent biological activity.

This  proposal  represents  a  novel  approach  for  antagonizing  Hdm2-Hdmx  E3-ligase  activity.  Selected 
cyclotides  will  be  highly  specific  for  antagonizing  Hdm2  E3-ligase,  and  for  eliciting  p53-dependent  cytotoxicity 
in  cancer  cells.  The  use  of  the  cyclotide  scaffold  will  enable  these  peptide  antagonists  to  have  the  required 
increased  stability,  cellular  membrane  penetration,  proteolysis  resistance  and  serum  clearance  needed  to  be 
considered viable drug development candidates. It is also worth noting that this cell-based
technology could be easily adapted to screen for antagonists of other relevant protein-protein interactions in
prostate cancer, for example those that may induce tumor cell apoptosis independently of p53, or for compounds
able  to  reactivate  mutant  p53.  These  compounds  could  be  used  in  combination  with  Hdm2  antagonists  to 
prevent  tumor  relapse  or  secondary  tumor  formation. 

Body 

A.  Specific  Aims 

We are using a cyclotide-based molecular scaffold for generating molecular libraries that are screened and
selected in vivo for potential antagonists of the RING-mediated Hdm2/Hdmx interaction. In this innovative
approach, we are using cell-based libraries (E. coli cell libraries) where every single cell expresses a different
cyclotide, in what we could call a single cell-single compound approach. These compounds are then screened
and selected for their ability to inhibit the Hdm2/Hdmx interaction inside the bacterial cell using a
genetically-encoded FRET-based reporter [7] in combination with high-throughput flow cytometry to identify bacteria
encoding cyclotides able to disrupt Hdm2-Hdmx interactions. This screening assay is optimized for use in
E. coli in combination with fluorescence-activated cell sorting (FACS) and is designed to minimize the number of
false  positives.  The  Camarero  lab  has  developed  a  similar  assay  to  screen  cyclotides  able  to  inhibit  B. 
anthracis  Lethal  Factor  protease  activity  anthrax  toxin  binding,  demonstrating  the  feasibility  of  this  approach  for 
use  in  bacteria. 

Selected  cyclotides  are  also  characterized  by  NMR  and  assayed  in  mammalian  cells  using  the  BiLC  assay 
implemented  in  p53  wild-type  and  p53-null  cancer  cell  lines  to  ascertain  p53-dependent  biological  activity.  The 
BiLC assay developed in the Wahl lab shows a high dynamic range and a high degree of reproducibility, and will be
used to validate the ability of any cyclotide selected in bacteria to enter mammalian cells at concentrations
sufficient to antagonize the Hdm2-Hdmx RING-mediated interaction.

Specific Aim 1. To screen and select cyclotide-based peptides able to disrupt the Hdm2-Hdmx RING
heterodimer. The objectives of this aim are the production of large genetically-encoded libraries of cyclotides in
living E. coli cells (~10⁹) and the development of a FRET-based in vivo screening reporter to select cyclotides
able to inhibit formation of the Hdm2-Hdmx RING heterodimer. Cells able to express active cyclotides will be
selected using high-throughput flow cytometry methods such as fluorescence-activated cell sorting (FACS).

Specific  Aim  2.  To  test  and  evaluate  the  inhibitory  and  biological  activity  of  selected  cyclotides.  Selected 
cyclotides  will  be  tested  in  vitro  using  a  combination  of  fluorescence  assays  and  nuclear  magnetic  resonance 
(NMR).  Biological  activity  will  be  assayed  using  different  cancer  cell  lines  to  evaluate  their  ability  to  activate 
endogenous  p53. 

B.  Studies  and  Results 

1) Biosynthesis and characterization of genetically-encoded cyclotide-based libraries. Our group has recently
developed and successfully used a biomimetic approach for the biosynthesis of folded cyclotides inside cells
by making use of a modified protein splicing unit [8]. Using this approach, we have biosynthesized a small
genetically-encoded library based on the cyclotide MCoTI-I [9] (see paper #7 in the Appendix Section for
further details). MCoTI-I is a powerful trypsin inhibitor recently isolated from the seeds of Momordica
cochinchinensis, a plant member of the Cucurbitaceae family. In order to explore the contribution of individual
residues to biological activity and structural integrity, a small library containing multiple amino acid mutants
was generated in E. coli cells and its activity assayed using a trypsin-binding assay. Using competition-binding
experiments we were able to estimate the relative binding propensities of the different mutants. These data
provide significant insights into the structural constraints of the MCoTI cyclotide framework and the functional
elements for trypsin binding. To our knowledge, this is the first reported biosynthesis of a genetically-encoded
library of MCoTI-based cyclotides containing a comprehensive suite of amino acid mutants. The mutagenesis results
obtained in our work highlight the extreme robustness of the cyclotide scaffold to mutations (Fig. 3). Only two
of the 27 mutations studied in the cyclotide MCoTI-I negatively affected the adoption of a native cyclotide fold.
Intriguingly, the rest of the mutations allowed the adoption of a native fold, as indicated by ES-MS analysis and
their ability to bind trypsin. These results should provide an excellent starting point for the effective design
of MCoTI-based cyclotide libraries for rapid screening and selection of de novo cyclotide sequences with specific
biological activities.

We have also used our unique ability to express folded cyclotides using bacterial expression systems to
incorporate NMR-active nuclei such as 15N into folded cyclotides to study the backbone dynamics of this
extraordinary molecular scaffold [10] (see paper #4 in the Appendix Section for further details). Determination
of the backbone dynamics of these fascinating micro-proteins is key for understanding their physical and
biological properties. Internal motions of a protein on different time scales, extending from picoseconds to
seconds, have been suggested to play an important role in its biological function. A better understanding of the
backbone dynamics of the cyclotide scaffold is extremely helpful for evaluating its utility as a scaffold for
developing specific protein-capture reagents. Such insight will help us in the design of optimal focused libraries
that can be used for the discovery of new cyclotide sequences with novel biological activities. We have
recently reported the backbone dynamics of the cyclotide MCoTI-I in the free state and complexed to its
binding partner trypsin in solution [10] (Fig. 4). This is the first time the backbone dynamics of a natively folded
cyclotide has been reported in the literature. Our results on the backbone dynamics of free cyclotide MCoTI-I
confirm that MCoTI-I adopts a well-folded and highly compact structure with an order parameter ⟨S²⟩ value of 0.83.
This value is similar to those found in the regions of well-folded proteins with restricted backbone dynamics. More
surprising, however, was the fact that the backbone of MCoTI-I, and especially the binding loop, showed increased
ps-ns mobility when bound to trypsin. This increase in backbone mobility may help to minimize the entropic
penalties required for binding. This dynamic decoupling of the side-chain terminus from the rest of the
aliphatic part of the side chain may be a general biophysical strategy for maximizing residual side-chain and
potentially backbone conformational entropy in proteins and their complexes.

Figure 3. Summary of the relative affinities for trypsin of the different MCoTI-I mutants studied
in this work [2]. A model of cyclotide MCoTI-I bound to trypsin is shown at the bottom and
indicates the positions of the mutations. The Lys4 side chain is shown in red bound to the
specificity pocket of trypsin. The model was produced by homology modeling at the Swiss
model workspace (http://swissmodel.expasy.org//SWISS-MODEL.html) using the structure of the
CPTI-II-trypsin complex (PDB ID: 2btc) as the template.

Using these data we have recently produced a combinatorial library using one of the loops (loop 2) of the
cyclotide MCoTI-I as a molecular scaffold in E. coli cells. For this purpose we have used two different
expression vectors: pTXB1 and pBAD24. The Mxe Gyrase intein was subcloned into pBAD24 following the
procedure depicted in Figure 5.

Figure 4. NMR analysis of the backbone dynamics of free and trypsin-bound MCoTI-I. a)
{15N,1H} NMR heteronuclear single quantum correlation (HSQC) spectrum of free MCoTI-I.
Chemical shift assignments of the backbone amides are indicated. b) Overlay of the
{15N,1H} HSQC spectra of free (black) and trypsin-bound MCoTI-I (red). Residues with
large average amide chemical shift differences between the two states (>0.3 ppm)
are indicated. Peaks that are broadened in trypsin-bound MCoTI-I are indicated by grey
circles. c) Average amide chemical shift difference for all the assigned residues in free
and trypsin-bound MCoTI-I.

Figure 6. Molecular approach to build MCoTI-I-based genetically encoded libraries using loop 2 (A) into orthogonal
plasmids pTXB1 and pBAD24-Intein (B).

Figure 7. Genomic characterization of the MCoTI library where loop 2 was completely randomized
using an NNK scheme. A. DNA sequences of loop 2 for 60 different clones randomly chosen from
a MCoTI-I loop 2 based library using the pTXB1 plasmid. B. Graphical representation of the amino
acid composition of the 60 clones sequenced. The graph was produced using WebLogo 3
(http://weblogo.threeplusone.com/).

Genetically-encoded cyclotide-based libraries were generated at the DNA level using double-stranded (ds) DNA
inserts with degenerate sequences for loop 2 of cyclotide MCoTI-I in combination with PCR. Briefly, a long
degenerate synthetic oligonucleotide template (≈100 nt long, encoding the whole cyclotide including the
degenerate loop) is PCR amplified using 5'- and 3'-primers corresponding to the non-degenerate flanking regions
(Fig. 6A). The degenerate synthetic oligonucleotide template is synthesized using an NN(G/T) codon scheme
for the randomized loop. This scheme uses 32 codons to encode all 20 amino acids and encodes only 1 stop
codon. The theoretical complexity of such a library is 20⁵, or 3.2 × 10⁶, sequences. The resulting double-stranded
degenerate DNA was double digested and then ligated to a linearized intein-containing expression vector to
produce a library of pTXB1- or pBAD24-based plasmids (Fig. 6B). These libraries were transformed into
electrocompetent E. coli cells to finally obtain a library of cells (≈10⁶). Characterization of the libraries
was carried out by picking and sequencing 50 different clones. The characterization of the library, where loop 2
was completely randomized, revealed a complexity of ≈0.9 × 10⁶ different unique cyclotide sequences, very close
to the theoretical maximum value (3.4 × 10⁶) (Fig. 7). This library has been cloned into two different expression
plasmids with different promoters, origins of replication and antibiotic resistance to allow the screening of the
libraries using cell-based reporters encoded in orthogonal plasmids. These include T7- and arabinose-driven
E. coli expression plasmids (Fig. 5), and are compatible with the plasmids that will be used to express the
cell-based reporters (see below). We estimated that ≈80% of the loop 2-randomized cyclotides were able to adopt a
native cyclotide fold (Fig. 8).
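
The codon arithmetic quoted for the NN(G/T) scheme (32 codons, all 20 amino acids, a single stop codon, 20⁵ peptide sequences) can be checked with a short script. This is only a minimal sketch: the standard genetic code table and the five randomized positions (taken from the 20⁵ figure) are the only inputs, and the variable names are illustrative rather than part of any pipeline used in this work.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: five randomized loop positions, as implied by the 20^5 figure.
LOOP_LENGTH = 5


def nnk_codons():
    """All NN(G/T) codons: any base at positions 1 and 2, G or T at position 3."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]


codons = nnk_codons()
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]
encoded_amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})

# Peptide-level and DNA-level diversity of the randomized loop.
peptide_diversity = len(encoded_amino_acids) ** LOOP_LENGTH
dna_diversity = len(codons) ** LOOP_LENGTH

# Fraction of random loops that contain no stop codon at any position.
stop_free_fraction = (1 - len(stop_codons) / len(codons)) ** LOOP_LENGTH

if __name__ == "__main__":
    print("NNK codons:", len(codons))
    print("stop codons:", stop_codons)
    print("amino acids encoded:", len(encoded_amino_acids))
    print("peptide diversity:", peptide_diversity)
    print("DNA diversity:", dna_diversity)
    print("stop-free fraction: %.3f" % stop_free_fraction)
```

The stop-free fraction is an extra quantity not reported in the text; it only indicates how many randomized loops would be expected to be full length if all codons were sampled uniformly.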


Figure 8. HPLC analytical traces of the cyclization/folding crudes for individual clones isolated from the MCoTI-I loop 2 library
expressed using plasmid pTXB1 in E. coli, before and after purification using trypsin-agarose beads. Folded cyclotides were
characterized by ES-MS (mass spectrometry) and by their ability to bind trypsin.

We have also shown that modified protein splicing units can be used for the generation of cell-based
libraries built on other cyclic molecular scaffolds, thus opening the possibility of using alternative templates
for the production of therapeutics able to inhibit molecular interactions relevant to prostate cancer. These
include the Bowman-Birk inhibitor SFTI-1 [9] and backbone-cyclized α-defensins (manuscript in preparation).

2) Cell-based reporter to screen in-cell RING-mediated Hdm2/HdmX interactions. We have generated a
fluorescent reporter to screen the RING-Mdm2/RING-MdmX interaction. The principle for this approach is depicted
in Figure 9. Our FRET-based reporter system uses the CyPet and YPet fluorescent proteins, which are fused to the
N-termini of the RING domains of Mdm2 (429-491 aa) and MdmX (427-490 aa), respectively. The N-terminal
region of the Mdm2/X RING domains can easily tolerate the addition of different protein domains or protein
fragments without altering their heterodimerization and biological function. For example, the Wahl group has
shown that half-luciferase fragments can be added without affecting the ability of the Hdm2 and Hdmx RING
domains to heterodimerize. Moreover, to prevent any potential steric hindrance that could interfere with the
molecular recognition process, we initially introduced the flexible linker [Gly-Gly-Ser]5 between the interacting
proteins or protein domains and the corresponding fluorescent proteins.

Both fluorescent protein constructs were successfully cloned and expressed in E. coli. As expected, CyPet-Mdm2
and YPet-MdmX were biologically active in an in vitro fluorescence-binding assay using fluorescence
resonance energy transfer (FRET) (Fig. 9A and 9B). When CyPet-MdmX was titrated with increasing
amounts of YPet-Mdm2 the FRET signal increased until saturation of the complex was achieved (Fig. 9B).
Using these data we were able to plot a binding isotherm for the RING-mediated Mdm2/MdmX interaction and
calculate the dissociation constant of this complex (Kd = 220 ± 50 nM, Fig. 9B inset). We are also planning to
titrate CyPet-Mdm2 with YPet-MdmX to confirm the value of the affinity constant between the two interacting
RING domains (work in progress). As shown in Figure 9C, we were also able to follow the inhibition of the
interaction by fluorescence. This was accomplished by titrating a solution containing the CyPet-MdmX/YPet-Mdm2
complex with ethylenediaminetetraacetic acid (EDTA), a well-known inhibitor of protein interactions
mediated by proteins containing transition metals. RING domains require two Zn atoms for correct folding.
Our data clearly indicate that the RING-based YPet/CyPet FRET reporter can be efficiently used in vitro to
screen antagonists against this heteromolecular complex.
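
How a dissociation constant can be extracted from such a titration is sketched below. This is a hedged illustration, not the analysis actually used for Figure 9B: it assumes simple 1:1 binding, a FRET signal proportional to the fraction of donor in complex, and a plain grid search over candidate Kd values. The donor concentration (100 nM), the titrant concentrations and the signal amplitude are hypothetical, and the data are synthetic, generated from the reported Kd of 220 nM.

```python
import math


def fraction_bound(acceptor_total, donor_total, kd):
    """Fraction of donor in complex for 1:1 binding (exact quadratic solution)."""
    s = acceptor_total + donor_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * acceptor_total * donor_total)) / 2.0
    return complex_conc / donor_total


def fit_kd(acceptor_totals, signals, donor_total, kd_grid):
    """Grid search over Kd; the amplitude is solved by linear least squares."""
    best = None
    for kd in kd_grid:
        fb = [fraction_bound(a, donor_total, kd) for a in acceptor_totals]
        # Best-fit signal amplitude for this Kd (closed form, one parameter).
        amplitude = sum(f * s for f, s in zip(fb, signals)) / sum(f * f for f in fb)
        sse = sum((s - amplitude * f) ** 2 for f, s in zip(fb, signals))
        if best is None or sse < best[0]:
            best = (sse, kd, amplitude)
    return best[1], best[2]


# Hypothetical experiment: 100 nM donor titrated with acceptor (all values in nM).
DONOR_NM = 100.0
titrant_nm = [25, 50, 100, 200, 400, 800, 1600, 3200]
# Synthetic, noise-free signals generated with Kd = 220 nM and amplitude 1000.
synthetic_signals = [1000.0 * fraction_bound(a, DONOR_NM, 220.0) for a in titrant_nm]

kd_fit, amplitude_fit = fit_kd(titrant_nm, synthetic_signals, DONOR_NM, range(10, 1001, 10))

if __name__ == "__main__":
    print("fitted Kd (nM):", kd_fit)
    print("fitted amplitude:", round(amplitude_fit, 1))
```

With real data a nonlinear least-squares routine would normally replace the grid search and also report an uncertainty on Kd; the grid search is used here only to keep the sketch dependency-free.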


Figure 9. FRET-based reporter for screening Mdm2/MdmX RING-mediated interactions. A. Principle for the
FRET-based reporter to screen for antagonists against the RING-mediated Mdm2/MdmX heterodimeric
complex. The formation of the complex brings the fluorescent proteins YPet and CyPet into close proximity. Excitation
of CyPet with blue light allows the transfer of energy to the yellow fluorescent protein YPet, producing yellow
fluorescence. Inhibition of the Mdm2/MdmX interaction prevents the transfer of energy from the blue fluorescent
protein CyPet, thus giving only blue fluorescence. B. Titration of CyPet-MdmX with increasing amounts of YPet-Mdm2.
The formation of the heterodimeric complex can be followed by the increase in yellow fluorescence.
Plotting the increase in yellow fluorescence versus the concentration of YPet-Mdm2 (inset) provides a
binding isotherm that allows the determination of the affinity constant between the RING domains of Mdm2 and
MdmX. C. The disruption of the Mdm2/MdmX complex by adding increasing amounts of EDTA can also be
followed by the decrease in yellow fluorescence.
Next we tested the ability to use this reporter inside living E. coli cells. For that purpose the FRET pairs
CyPet-MdmX/YPet-Mdm2 or CyPet-Mdm2/YPet-MdmX were cloned into a pRSF duet vector and expressed in E. coli
cells (Fig. 10). The pRSF duet vector allows polycistronic expression of the corresponding YPet-CyPet FRET pair,
and is orthogonal to the pBAD and pTXB1 vectors (both used for the construction of the cyclotide-based
libraries). As shown in Figure 10, when the Hdm2-Hdmx RING heterodimer is formed in vivo the CyPet-YPet pair
exhibits a high FRET signal, indicating the formation of the complex. Background FRET signal was
evaluated using cells expressing YPet-MdmX and the CyPet-Mdm2 mutant (C464A). This mutation in the
RING domain of Mdm2 prevents Hdm2-Hdmx heterodimerization and provides the background FRET signal
when the Mdm2/MdmX complex is not formed. We also tested the FRET background using a cell line
co-expressing the RING domain of BARD-1 (25-189) fused to YPet at its N-terminus with CyPet-MdmX, as these
two RING domains do not interact with each other. In both cases the background FRET signal when RING
heterodimerization was prevented (i.e. FRET-off state) was 200% smaller than that of the positive control using
CyPet-MdmX/YPet-Mdm2 or CyPet-Mdm2/YPet-MdmX (i.e. FRET-on state) (Fig. 10B). Analysis by
fluorescence-activated cell sorting (FACS) also revealed that both populations of cells, i.e. FRET-on and
FRET-off cells, can be easily separated (Fig. 10C). We anticipate that this in vivo FRET reporter will
allow the easy separation of positive clones by FACS. At this time we are also exploring whether shortening
the [Gly-Gly-Ser] linker used to separate the fluorescent and RING domains improves the FRET signal (work
in progress).
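
The sorting logic behind such a screen can be sketched as a ratiometric gate: cells in which a cyclotide disrupts the RING heterodimer should show a lower acceptor/donor emission ratio than the FRET-on control. The snippet below is an illustrative assumption, not the gating actually used on the sorter; the three-standard-deviation cutoff, the function names and all intensity values are hypothetical.

```python
from statistics import mean, stdev


def fret_ratio(acceptor, donor):
    """Acceptor (YPet) emission divided by donor (CyPet) emission for one cell."""
    if donor <= 0:
        raise ValueError("donor intensity must be positive")
    return acceptor / donor


def hit_gate(fret_on_ratios, n_sd=3.0):
    """Upper ratio limit for a candidate hit: mean minus n_sd SDs of the FRET-on control."""
    return mean(fret_on_ratios) - n_sd * stdev(fret_on_ratios)


def candidate_hits(cells, gate):
    """Indices of cells whose ratio falls below the gate (interaction disrupted)."""
    return [i for i, (acc, don) in enumerate(cells) if fret_ratio(acc, don) < gate]


# Hypothetical per-cell ratios from a FRET-on control population.
fret_on_control = [2.0, 2.1, 1.9, 2.05, 1.95]
# Hypothetical library cells as (acceptor, donor) intensity pairs.
library_cells = [(2000, 1000), (900, 1000), (1500, 1000), (1900, 1000)]

gate = hit_gate(fret_on_control)
hits = candidate_hits(library_cells, gate)

if __name__ == "__main__":
    print("gate: %.3f" % gate)
    print("candidate hit indices:", hits)
```

Using a ratio rather than raw acceptor intensity makes the gate less sensitive to cell-to-cell differences in reporter expression, which is the same concern addressed by the YPet quantification control in Figure 10B.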


Figure 10. In-cell FRET-based reporter to screen inhibitors against the RING-mediated Mdm2/MdmX interaction. A.
Cloning and co-expression of YPet-Mdm2 and CyPet-MdmX fluorescent constructs in E. coli. YPet-Mdm2 and
CyPet-MdmX were cloned into a pRSF duet expression vector. The resulting vector was used to transform E. coli
cells and both proteins were expressed at room temperature for 18 h. Purified proteins were analyzed by SDS-PAGE.
B. Fluorescence spectra of live E. coli cells expressing YPet-Mdm2 / CyPet-MdmX (FRET-on), YPet-Mdm2
(C464A) / CyPet-MdmX (FRET-off), and YPet-BARD-1 / CyPet-MdmX (FRET-off). Excitation to quantify FRET signal
was performed at 414 nm. Inset: quantification of fluorescent protein YPet in live E. coli cells expressing the
FRET reporter indicates that the differences observed in FRET signal are not due to different expression levels of
the YPet fusion protein. Cells were excited at 490 nm for YPet quantification. C. Analysis by FACS of live E. coli
cells expressing YPet-Mdm2 / CyPet-MdmX (FRET-on) and YPet-Mdm2 (C464A) / CyPet-MdmX (FRET-off). The RING
domains of Mdm2 (C464A) and BARD-1 do not interact with the RING domain of MdmX and were used as negative
controls to evaluate the background fluorescence of the FRET-off state.
3) Cellular uptake of MCoTI-cyclotides.

MCoTI-cyclotides have been shown recently to be able to enter human macrophages and breast cancer
cell lines [11]. Internalization into macrophages was shown to be mediated mainly through macropinocytosis,
a form of endocytosis that is actin-mediated and results in the formation of large vesicles termed macropinosomes
[12, 13]. It should be noted, however, that in this study the visualization of MCoTI-II uptake was done in fixed
and not in live cells. Analysis of live cells provides the ability to visualize events in real time without the
possible complications of fixation artifacts that have confounded interpretations of the uptake of Tat and other
related peptides, for instance [14, 15]. In addition, macropinocytosis is a dominant mechanism for endocytic uptake
in macrophages [16-18], unlike other cells that are not specialized for large-scale sampling of extracellular fluid
and which use multiple alternative endocytic mechanisms. These mechanisms can include clathrin-mediated
endocytosis, caveolar endocytosis, macropinocytosis, phagocytosis, flotillin-dependent endocytosis, as well as
multiple other as yet under-characterized mechanisms [19, 20]. Intrigued by these results, we explored the
cellular uptake of site-specific fluorescent-labeled MCoTI-cyclotides and studied the cellular uptake
mechanisms in HeLa cells using live cell imaging by confocal fluorescence microscopy.

We have reported for the first time the cellular uptake of MCoTI-cyclotides monitored by real-time confocal
fluorescence microscopy imaging in live HeLa cells (manuscript under revision, Appendix: paper #1). Our
results clearly show that HeLa cells readily internalize fluorescently-labeled MCoTI-I. We found that this
process is temperature-dependent and can be reversibly inhibited at 4°C, which indicates an active
mechanism of internalization. The internalized cyclotide also seems to colocalize in live cells with multiple
endocytic markers including, to the greatest extent, the fluid-phase endocytic marker dextran (10 kDa
dextran, 10K-Dex). Internalized MCoTI-I was colocalized to a lesser extent with the cholesterol/lipid-dependent
endocytic marker cholera toxin B (CTX-B) and the clathrin-mediated endocytic marker EGF. Internalized
MCoTI-I was localized within a fairly rapid time course with late endosomal and lysosomal compartments, which
engaged in rapid and directed movements suggestive of cytoskeletal involvement. MCoTI-I uptake in HeLa
cells was not impaired by Latrunculin B (Lat B), a well-known inhibitor of macropinocytosis. Altogether, these
data seem to indicate that the MCoTI-I cyclotide is capable of internalization in live cells through multiple endocytic
pathways that may be dominant in the particular cell type under study. The lack of strong preference for
MCoTI-I internalization via a specific cellular internalization pathway is of significant value, since the lack of
endogenous affinity for a particular pathway can enable ready re-targeting by introduction of targeting
peptides within the scaffold that may enable specific and targeted endocytic uptake into a particular target cell.
At the same time, the ready uptake of MCoTI-I by multiple pathways suggests accessibility, in the untargeted
form, to essentially all cells.

In order to study the cellular uptake of MCoTI-cyclotides, we decided to use MCoTI-I. MCoTI-I contains only
one Lys residue located in loop 1, whereas MCoTI-II contains three Lys residues in the same loop (Fig. 11).
The presence of only one Lys residue facilitates the site-specific introduction of a unique fluorophore on
the sequence, thus minimizing any effect that the introduction of this group may have on the cellular uptake
properties of the cyclotide (see paper #1 in the Appendix section for a detailed description of the
materials and methods).

Folded MCoTI-I cyclotide was produced either by recombinant or synthetic methods. In both cases the backbone
cyclization was performed by an intramolecular native chemical ligation (NCL) [21-24] using the native Cys
located at the beginning of loop 6 to facilitate the cyclization. This ligation site has been shown to give very
good cyclization yields [8, 25]. Intramolecular NCL requires the presence of an N-terminal Cys residue and a
C-terminal α-thioester group in the same linear precursor [23, 26]. In the biosynthetic approach, the MCoTI-I
linear precursor was fused in frame at its C- and N-termini to a modified Mxe Gyrase A intein and a Met
residue, respectively, and expressed in Escherichia coli [9]. This allows the generation of the required
C-terminal thioester and N-terminal Cys residue after in vivo processing by endogenous Met aminopeptidase
(MAP) [27, 28]. Cyclization and folding can be accomplished very efficiently in vitro by incubating the MCoTI-I
intein fusion construct in sodium phosphate buffer at pH 7.4 in the presence of reduced glutathione (GSH).
Biosynthetic MCoTI-cyclotides generated this way have been shown to adopt a native folded structure by
NMR and trypsin inhibitory assays [10, 25, 28].

Figure 11. Primary and tertiary structures of MCoTI and kalata cyclotides. The structures of MCoTI-II
(PDB ID: 1IB9 [1]) and kalata B1 (PDB ID: 1NB1 [5]) are shown. Conserved cysteine residues and disulfide bonds
are shown in yellow. An arrow marks residue Lys4 located at loop 1 in MCoTI-cyclotides. This residue in
MCoTI-I was used for the site-specific conjugation of AlexaFluor488 N-hydroxysuccinimide ester (AF488-OSu)
through a stable amide bond.

Natively folded MCoTI-II has already been successfully produced using Fmoc-based solid-phase peptide synthesis [29, 30]. Encouraged by these results we also explored the production of MCoTI-I by chemical synthesis (Fig. 12). For this purpose the MCoTI-I linear precursor α-thioester was assembled by Fmoc-based solid-phase peptide synthesis on a sulfonamide resin [31, 32] (Fig. 12A). Activation of the sulfonamide linker with iodoacetonitrile followed by cleavage with ethyl mercaptoacetate and acidolytic deprotection with TFA provided the fully deprotected linear peptide α-thioester (Fig. 12B). The synthetic linear precursor thioester was then efficiently cyclized and folded in a one-pot reaction using sodium phosphate buffer at pH 7.5 in the presence of 2 mM GSH. The reaction was complete in 18 h and the folded product was purified by reverse-phase HPLC and characterized by ES-MS. The measured mass was in agreement with that expected for folded MCoTI-I (expected mass = 3480.9 Da; measured = 3481.0 ± 0.4 Da). Synthetic folded MCoTI-I was also shown to co-elute by HPLC with recombinant natively folded MCoTI-I (data not shown). The biological activity of synthetic MCoTI-I was assayed by using a trypsin pull-down experiment [9, 25]. As shown in Figure 12B, synthetic folded MCoTI-I was specifically captured from a cyclization/folding crude reaction by trypsin-immobilized Sepharose beads [8, 9, 25], thus indicating that it had adopted a native cyclotide fold.

Purified MCoTI-I was site-specifically labeled with AlexaFluor 488 (AF488) for live confocal imaging. The ε-amino group of the Lys4 residue located in loop 1 was conjugated to AF488-OSu in sodium phosphate buffer at pH 7.5 for 2 h (Fig. 13A). Under these conditions the main product of the reaction was mono-labeled AF488-MCoTI-I as characterized by HPLC and ES-MS (expected average mass = 3997.9 Da; measured = 3997.4 ± 0.3 Da) (Fig. 13C). AF488-labeled MCoTI-I was then purified by reverse-phase HPLC to remove any trace of unreacted materials (Fig. 13B).
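
The two ES-MS comparisons quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the four mass values from the text, and the class name `MassCheck` and its fields are hypothetical. The mass added by one label is derived here from the quoted expected masses, not from the dye's chemical formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MassCheck:
    """One expected-versus-measured comparison (hypothetical helper)."""
    name: str
    expected_da: float  # calculated mass quoted in the text
    measured_da: float  # ES-MS value quoted in the text
    sd_da: float        # reported +/- on the measurement

    @property
    def delta_da(self) -> float:
        return self.measured_da - self.expected_da

    @property
    def delta_ppm(self) -> float:
        return 1e6 * self.delta_da / self.expected_da


unlabeled = MassCheck("MCoTI-I", 3480.9, 3481.0, 0.4)
labeled = MassCheck("AF488-MCoTI-I", 3997.9, 3997.4, 0.3)

for m in (unlabeled, labeled):
    print(f"{m.name}: delta = {m.delta_da:+.1f} Da ({m.delta_ppm:+.0f} ppm)")

# Mass gained on labeling, from the expected and from the measured values.
expected_shift = labeled.expected_da - unlabeled.expected_da
measured_shift = labeled.measured_da - unlabeled.measured_da
print(f"shift on labeling: expected {expected_shift:.1f} Da, "
      f"measured {measured_shift:.1f} Da")
```

A single shift close to the expected value is what one would look for in a mono-labeled product; a doubly labeled species would show roughly twice that shift.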

In order to infer the correct conclusions regarding data obtained on the cellular uptake of native MCoTI-I when using modified cyclotides, like AF488-MCoTI-I for example, it is critical to be sure that they still adopt structures similar to that of the native form. MCoTI-cyclotides are extremely stable to chemical and thermal denaturation, and they have been shown to be able to withstand procedures like reverse-phase chromatography in the presence of organic solvents under acidic conditions without affecting their tertiary structure [8, 10, 28-30, 33]. It is also unlikely that acylation of the ε-amino group of Lys4 in MCoTI-I disrupts the tertiary structure of this cyclotide. Craik and co-workers have previously shown that biotinylation of the three Lys residues located in loop 1 in MCoTI-II, including Lys4 (Fig. 11), does not disrupt the native cyclotide fold as determined by ¹H-NMR [11]. We have also recently shown that mutation of residue Lys4 to Ala does not seem to affect the ability of this mutant to adopt a native cyclotide fold, thus indicating that the presence of a positively charged residue in this position is not critical for the tertiary structure of MCoTI-I [25]. Similar findings have been reported by Leatherbarrow and coworkers, where MCoTI-cyclotides with this residue mutated to Phe or Val were still able to fold correctly and showed inhibitory activity against chymotrypsin and human elastase, respectively [29]. Altogether these facts suggest that residue Lys4 is not critical for adopting the native cyclotide fold and that its modification is unlikely to disturb the tertiary structure of MCoTI-cyclotides.

Figure 12. Chemical synthesis of MCoTI-I. (A) Synthetic scheme used for the chemical synthesis of cyclotide MCoTI-I by Fmoc-based solid-phase peptide synthesis. (B) Analytical reverse-phase HPLC traces of MCoTI-I linear precursor α-thioester, cyclization/folding crude and MCoTI-I purified by either affinity chromatography using trypsin-immobilized Sepharose beads or semipreparative reverse-phase HPLC. HPLC analysis was performed in all cases using a linear gradient of 0% to 70% buffer B over 30 min. Detection was carried out at 220 nm. An arrow indicates the desired product in each case.


To study the cellular uptake of AF488-MCoTI-I we used HeLa cells. The internalization studies were all carried out with 25 µM AF488-MCoTI-I. This concentration provided a good signal/noise ratio for live cell confocal fluorescence microscopy studies and did not show any cytotoxic effect on HeLa cells. This is in agreement with the cellular tolerance of wild-type and biotinylated MCoTI-II reported for other types of human cell lines [11]. First, we analyzed the time course of changes in cellular distribution following uptake of 25 µM AF488-MCoTI-I by incubating the cells with the cyclotide for 1 h and then analyzing its distribution after 1, 2, 4, 8 and 10 h. As shown in Figure 14, the internalized cyclotide was clearly visible within perinuclear punctate spots inside the cells after 1 h incubation. Observation of cells pulsed with AF488-MCoTI-I for 1 h and then incubated for longer periods of time in the absence of cyclotide did not show any evidence of decreased intracellular fluorescence, while the largely perinuclear distribution of internalized MCoTI-I appeared comparable at all time points. Similar results have recently been reported on the internalization of biotinylated MCoTI-II in macrophage and breast cancer cell lines [11]; these studies, however, used fixed cells to visualize the internalized cyclotide.

In order to study the mechanism of internalization of AF488-MCoTI-I in live HeLa cells, we first explored the effect of temperature on the uptake process. Active and energy-dependent endocytic mechanisms of internalization are inhibited at 4°C [34]. The internalization of AF488-MCoTI-I was totally inhibited after a 1 h incubation at 4°C (Fig. 15). This inhibition was completely reversible: when the same cells were incubated again at 37°C for 1 h, the punctate intracellular fluorescence labeling pattern was restored. This result confirmed that the uptake of AF488-MCoTI-I in HeLa cells follows a temperature-dependent active endocytic internalization pathway. It should be noted that no significant surface binding was detected at 4°C, suggesting that MCoTI-I does not bind a surface receptor, even nonspecifically. This is in agreement with studies on MCoTI-II in fixed cells, so both MCoTI-I and MCoTI-II appear to lack specific affinity for proteins or lipids in cell membranes, unlike the kalata B1 cyclotide, which shows membrane affinity [11]. This lack of endogenous affinity for a specific surface receptor or membrane constituent makes MCoTI-I ideal for engineering with more specific, receptor-directed, peptide-based internalization motifs within the scaffold that might enable members of this family to be targeted to a specific cell type.

Next, we investigated the internalization pathway used by labeled MCoTI-I to enter HeLa cells. There are several known and well-characterized mechanisms of endocytosis [35]. It is also now well established that almost all cell-penetrating peptides (CPPs) use a combination of different endocytic pathways rather than a single endocytic mechanism [35]. A recent study showed that several CPPs (including Antennapedia/penetratin, nona-Arg and Tat peptides) can be internalized into cells by multiple endocytic pathways including macropinocytosis, clathrin-mediated endocytosis, and caveolae/lipid raft mediated endocytosis [36]. To investigate if that was the case with the internalization of AF488-MCoTI-I in HeLa cells, we decided to look at its colocalization with various endocytic markers (Fig. 16). 10K-dextran (10K-Dex) has previously been used as a marker of fluid-phase endocytosis [12, 37-39]. Cholera toxin B (CTX-B) has been used as a marker for various lipid-dependent endocytic pathways [20, 40], while epidermal growth factor (EGF) has traditionally been a marker of clathrin-mediated endocytosis [41-43]. As shown in Figure 16, after 1 h AF488-MCoTI-I fluorescence was significantly colocalized with the fluorescence associated with 10K-Dex (59 ± 4% of total cyclotide fluorescence pixels were colocalized with 10K-Dex fluorescent pixels). Less colocalization was observed with fluorescent CTX-B (39 ± 4%) and fluorescent EGF (21 ± 2%). These data seem to suggest that AF488-MCoTI-I is primarily entering cells through fluid-phase endocytosis. The lower levels of colocalization observed with CTX-B and EGF also suggest that AF488-MCoTI-I could be using alternative or additional endocytic pathways. The colocalization results could also be attributed, however, to the merging of endosomal uptake vesicles generated by different pathways at the level of an early endosome. To address whether the major uptake and colocalization of AF488-MCoTI-I with 10K-Dex was due to cointernalization by macropinocytosis, we explored the inhibition of AF488-MCoTI-I uptake by latrunculin B (Lat B), a potent inhibitor of actin polymerization, which is an essential element of macropinocytosis [44-47]. As shown in Figure 17, Lat B did not significantly inhibit uptake of AF488-MCoTI-I (Fig. 17A) or of 10K-Dex (data not shown), even though treatment of HeLa cells with this agent resulted in a total disruption of the actin filament network (Fig. 17B). These data suggest that macropinocytosis is not responsible for uptake of either 10K-Dex or AF488-MCoTI-I in HeLa cells.

Figure 13. Site-specific labeling of MCoTI-I with AlexaFluor-488 N-hydroxysuccinimide ester (AF488-OSu). (A) Scheme depicting the bioconjugation process and localization of the fluorescent probe at residue Lys4 in loop 1. (B) Analytical reverse-phase HPLC trace of pure AF488-MCoTI-I. HPLC analysis was performed using a linear gradient of 0% to 70% buffer B over 30 min. Detection was carried out at 220 nm. (C) ES-MS spectrum of pure AF488-MCoTI-I.

As an extension of these inhibition studies, cells were also treated with methyl-β-cyclodextrin (MβCD), a well-established cholesterol-depleting agent employed for studying the involvement of lipid rafts/caveolae in endocytosis [48, 49]. Preliminary studies with MβCD suggested no significant inhibition of AF488-MCoTI-I uptake (data not shown). Since the extent of total colocalization of AF488-MCoTI-I with CTX-B was less than 40%, it is unsurprising that no marked effect was seen by live cell microscopy. Taken together, these results seem to suggest that the uptake of AF488-MCoTI-I in HeLa cells follows multiple endocytic pathways, which is in agreement with what has been recently reported for different CPPs [36].
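
The colocalization percentages discussed above are ratios of pixel counts. The report states only that the Zeiss LSM software was used, so the following is a minimal sketch of one common definition, not the software's actual procedure: the percentage of above-threshold cyclotide (green) pixels in a region of interest that are also above threshold in the marker (red) channel. The function name, the thresholds and the toy intensities are all hypothetical.

```python
from typing import Sequence

Image = Sequence[Sequence[float]]  # rows of pixel intensities for one channel


def percent_colocalized(green: Image, red: Image,
                        green_thr: float, red_thr: float) -> float:
    """Percent of above-threshold green pixels also above threshold in red."""
    green_positive = 0
    both_positive = 0
    for g_row, r_row in zip(green, red):
        for g, r in zip(g_row, r_row):
            if g > green_thr:
                green_positive += 1
                if r > red_thr:
                    both_positive += 1
    if green_positive == 0:
        raise ValueError("no green pixels above threshold in this ROI")
    return 100.0 * both_positive / green_positive


# Toy 3 x 4 ROI with made-up intensities (not data from this report).
green = [[0, 90, 80, 0],
         [0, 70, 0, 0],
         [60, 0, 0, 50]]
red = [[0, 85, 0, 0],
       [40, 75, 0, 0],
       [65, 0, 0, 0]]

pct = percent_colocalized(green, red, green_thr=30, red_thr=30)
print(f"{pct:.0f}% of cyclotide pixels overlap the marker")
```

Because the denominator is the cyclotide channel alone, the percentages for different markers need not sum to 100%, which is consistent with the values reported for 10K-Dex, CTX-B and EGF.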


Figure 14. MCoTI-I distribution in HeLa cells. HeLa cells were incubated with 25 µM MCoTI-I for 1 h, cyclotide was removed with gentle rinsing in PBS and then the cells were monitored for distribution of intracellular fluorescence at intervals from 1-10 h using confocal fluorescence microscopy. Bar = 10 µm.


Next we explored the fate of the endocytic vesicles containing labeled MCoTI-I. There are at least two pathways that involve the cellular trafficking of endosomal vesicles. The degradative pathway includes routing of internalized materials from early endosomes via late endosomes to lysosomes, where degradation of internalized materials occurs within the cells. On the other hand, recycling endosomes sort material internalized into early endosomes and are responsible for effluxing internalized material back to the cellular membrane [50]. If labeled MCoTI-I were localized in recycling endosomes, it would be expected that its concentration in the cell would decrease and/or that it would accumulate on the membrane over time, which was not the case in the time course experiment following the cellular fate of internalized cyclotide (Fig. 14). To explore the potential localization of labeled MCoTI-I in lysosomes we first used LysoTracker Red (LysoRed). This pH-sensitive fluorescent probe is utilized for identifying acidic organelles, such as lysosomes and late endosomes, in live cells. As shown in Figure 18, significant colocalization (60 ± 4.0% as determined by pixel colocalization analysis) of LysoRed and AF488-MCoTI-I was observed after treating the cells for 1 h with both agents. As an extension of these experiments, we also investigated the colocalization of labeled MCoTI-I and lysosomal-associated membrane protein 1 (Lamp1), an established mature lysosomal marker [51, 52]. For this experiment, live HeLa cells were first infected with a Red Fluorescent Protein (RFP)-Lamp1-expressing BacMam virus. The next day the cells were incubated with AF488-MCoTI-I for 1 h and imaged. As shown in Figure 18, colocalization was also seen for AF488-MCoTI-I and RFP-Lamp1 (38 ± 5%, as determined by pixel colocalization analysis), suggesting that even after 1 h, a significant amount of MCoTI-I has already reached the lysosomal compartments. Our data suggest that after 1 h, a significant amount of MCoTI-I (≈40%) has trafficked through the endosomal pathway to the lysosomes and that ≈20% is already localized in late endosomes or other types of acidic organelles. It has previously been reported that the perinuclear steady-state distribution of lysosomes is a balance between movement on microtubules and actin filaments [53-55]. Likewise, movement from early endosomal compartments to late endosomes to lysosomes has also been shown to rely on the microtubule network [56, 57]. As an extension of these experiments, and to investigate whether MCoTI-I-containing vesicles were actively trafficking inside the cell, we captured time-lapse video of cells after incubation with MCoTI-I for 1 h. Indeed, the time-lapse capture showed active movements of MCoTI-I-containing vesicles (Fig. 19). Directed short- and long-range movements could be seen, characteristic of movement on cytoskeletal filaments. These results suggest that while a large portion of MCoTI-I has reached lysosomal compartments by 1 h, and some of the movements seen may be attributed to the steady-state distribution of lysosomes, the remaining cyclotide may still be trafficking through the cell from other membrane compartments, likely within late endosomes.

Figure 15. Endocytosis of MCoTI-I is temperature-dependent. HeLa cells were incubated with 25 µM MCoTI-I for 1 h at 4°C. After removal of the MCoTI-I-containing media and a gentle PBS wash, the cells were imaged. Following imaging, the MCoTI-I-containing media was replaced and the cells were incubated at 37°C for 1 h and imaged again. Bar = 10 µm.

Figure 16. Colocalization of MCoTI-I with markers of endocytosis. (A) HeLa cells were incubated with 25 µM MCoTI-I and either 1 mg/ml 10K-dextran (10K-Dex), 10 µg/ml cholera toxin B (CTX-B), or 400 ng/ml epidermal growth factor (EGF) for 1 h at 37°C as described in Materials and Methods and then imaged. Bar = 10 µm. (B) Quantification of pixel colocalization was done using the Zeiss LSM software for image analysis and measures the % of total fluorescent AF488-MCoTI-I pixels in the ROI that colocalize with red pixels associated with the different endocytic markers (n = 13 cells for 10K-Dex, n = 11 cells for CTX-B and n = 10 cells for EGF, with cells assessed across 3 different experiments; * p < 0.05 relative to 10K-Dex, # p < 0.05 relative to CTX-B).

Figure 17. Disruption of actin does not inhibit MCoTI-I uptake. (A) HeLa cells were untreated (control) or treated with Lat B (2 µM) for 30 min at 37°C prior to addition of 25 µM MCoTI-I. Following uptake for 1 h at 37°C, the cells were imaged using confocal fluorescence microscopy. Bar = 10 µm. (B) HeLa cells without treatment (control) or treated with 2 µM Lat B for 30 min at 37°C were fixed and labeled with rhodamine-phalloidin to label actin (red) and DAPI to label nuclei (blue). Bar = 10 µm.

Figure 18. MCoTI-I is colocalized with lysosomal compartments. (A) Untreated or BacMam-RFP-Lamp1-treated HeLa cells were incubated with 25 µM MCoTI-I and LysoTracker Red, or MCoTI-I alone, for 1 h at 37°C as described in Materials and Methods and then imaged. Bar = 10 µm. (B) Quantification of the % of total fluorescent AF488-MCoTI-I pixels colocalized with fluorescent pixels associated with each marker was done using the Zeiss LSM software for image analysis (n = 14 cells for LysoTracker Red and n = 11 cells for Lamp1, with cells selected from 3 separate experiments).
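
The report does not spell out how the ≈40% and ≈20% estimates were obtained. One plausible reading, offered here only as an assumption, is that the Lamp1 value (38%) approximates the lysosomal fraction, while the difference between the LysoTracker Red value (60%) and the Lamp1 value approximates the fraction in acidic compartments that are not mature lysosomes. The sketch below works through that subtraction and combines the two reported uncertainties in quadrature, which further assumes they are independent even though the two markers were measured on different sets of cells. The function name is hypothetical.

```python
import math


def difference_with_error(a: float, a_err: float,
                          b: float, b_err: float) -> tuple[float, float]:
    """Return a - b and its uncertainty, assuming independent errors."""
    return a - b, math.hypot(a_err, b_err)


# Values quoted in the text for colocalization with AF488-MCoTI-I after 1 h.
lysotracker_pct, lysotracker_err = 60.0, 4.0  # acidic organelles
lamp1_pct, lamp1_err = 38.0, 5.0              # mature lysosomes

late_endo_pct, late_endo_err = difference_with_error(
    lysotracker_pct, lysotracker_err, lamp1_pct, lamp1_err)

print(f"lysosomal (Lamp1-positive): {lamp1_pct:.0f} +/- {lamp1_err:.0f} %")
print(f"acidic but Lamp1-negative: {late_endo_pct:.0f} "
      f"+/- {late_endo_err:.1f} %")
```

Under these assumptions the difference is consistent with the ≈20% figure, but its uncertainty is large relative to its size, so it is best read as a rough estimate.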

Craik and co-workers have recently reported the uptake of biotinylated MCoTI-II by human macrophages and breast cancer cell lines [11]. This work concluded that the uptake of MCoTI-II in macrophages is mediated by macropinocytosis and that the cyclotide accumulates in macropinosomes without trafficking to the lysosome. MCoTI-II shares high homology with MCoTI-I (≈97% homology, see Fig. 11) and similar biological activity. Despite their similarities, the differences in the cellular uptake and trafficking of MCoTI-cyclotides by macrophages versus HeLa cells could be attributed to the differences in endocytic preferences of these two very different cell types. Macrophages are specialized in large-scale sampling of extracellular fluid using macropinocytosis as the dominant endocytic pathway. Meanwhile, other types of cells may use multiple endocytic pathways, as has been recently shown for the uptake of different CPPs in HeLa cells [36].


At this point we cannot be certain if some labeled MCoTI-I is able to escape from endosomal/lysosomal compartments into the cytosol. The ability to track the release of fluorescently labeled molecules from cellular vesicles by live cell imaging of the fluorescence signal is limited, primarily due to the large dilution effect if the molecule is able to escape the highly confined volume of the vesicle into the larger cytosolic volume. One way to demonstrate the release of peptide into the cytosol, however, would be by using labels with better detection sensitivity or incorporating a biological activity that can be measured in the cellular cytosol. For example, the presence of Tyr residues in both MCoTI-cyclotides should facilitate the incorporation of radioactive iodine into the phenolic ring of Tyr with minimal disruption of the native structure of the cyclotide. The incorporation or grafting of biological peptides into the MCoTI scaffold could also provide proof of endosomal/lysosomal escape if such biological activity could be measured only in the cytosol. This approach has already been used to demonstrate endosomal escape of CPPs such as the TAT peptide [58, 59]. The retention of fluorescence signal in the perinuclear, lysosomal compartments for a period of up to 10 h suggests that most of the cyclotide remains within these compartments; however, given the flexibility of the cyclotide backbone to accommodate multiple peptide sequences, subsequent studies may explore the use of targeting and endosomolytic sequences for concomitant targeted entry and endosomal/lysosomal escape into the cytosol.


Figure 19. MCoTI-I-containing vesicles are in motion. HeLa cells were incubated with 25 µM MCoTI-I for 1 h at 37°C and then imaged using time-lapse microscopy as described in Materials and Methods. Arrows indicate the position of the moving vesicle at 0 min, while displacement of the fluorescent vesicle relative to the arrow shows the extent of movement over time. Bar = 2 µm.

In conclusion, this study reports the first analysis of intracellular uptake of MCoTI-I cyclotide using live cell imaging by confocal fluorescence microscopy. Cyclotides represent a novel platform for drug development. Their stability, conferred by the cyclic cystine knot, their small size, their amenability to both chemical and biological synthesis and their flexible tolerance to sequence variation make them ideal for grafting of biologically active therapeutic epitopes. As we show herein, they are also capable in the unmodified state of utilizing multiple cellular endocytic pathways for internalization. Their ready uptake gives them access, in their current state, to the endosomal/lysosomal compartments of virtually any cell. Without an apparent strong preference for an existing cellular pathway or surface-expressed epitope in HeLa cells (or in other studies with MCoTI-II in macrophages [11]), they appear highly amenable to retargeting to exploit a particular target cell’s dominant internalization pathway and/or unique surface receptor repertoire, along with the targeted introduction of biologically active therapeutic motifs.


Key  Research  Accomplishments  (2010-2011) 

•  We have reported the biosynthesis of a genetically-encoded library of MCoTI-based cyclotides containing a comprehensive suite of amino acid mutants. The mutagenesis results obtained in our work highlight the extreme robustness of the cyclotide scaffold to mutations. The results obtained are key for the design of larger combinatorial libraries to screen RING Mdm2/MdmX antagonists (Appendix Section: paper #7).


•  We have accomplished the biosynthesis of a large combinatorial library (≈10^6 members) of genetically-encoded cyclotides in Escherichia coli using loop 2 of cyclotide MCoTI-I as molecular template. This is the first time that a library with such a high diversity has been produced. This library will be used to screen RING Mdm2/MdmX antagonists.

•  We  have  accomplished  the  cloning  of  cyclotide-based  libraries  into  different  E.  coli  expression  plasmids 
with  different  promoters,  origin  of  replication  and  antibiotic  resistance  to  allow  the  screening  of  the  libraries 
using  cell-based  reporters  encoded  in  orthogonal  plasmids.  This  will  allow  in-cell  screening  of  genetically 
encoded  cyclotide-based  libraries. 

•  We have carried out the first reported study of the backbone dynamics of a natively folded MCoTI-I cyclotide in the free state and complexed to its binding partner trypsin in solution. This accomplishment is critical to understand the dynamics and better define the loops that are most amenable to randomization in the generation of MCoTI-based libraries (Appendix Section: paper #5).

•  We  have  developed  a  FRET-based  reporter  using  fluorescent  proteins  CyPet  and  YPet  able  to  detect 
both  in  vitro  and  in  vivo  protein-protein  antagonists  for  the  RING-mediated  Hdm2/HdmX  interaction.  We  are 
now  optimizing  the  constructs  to  maximize  the  fluorescence  signal.  This  accomplishment  is  key  for  the 
success  of  the  project. 

•  We have investigated the cellular uptake of the cyclotide MCoTI-I in live HeLa cells. Using real-time confocal fluorescence microscopy imaging we show that MCoTI-I is readily internalized in live HeLa cells and that its endocytosis is temperature-dependent. Endocytosis of MCoTI-I in HeLa cells is achieved primarily through fluid-phase endocytosis, as evidenced by its significant colocalization with 10K-dextran, but also through other pathways, as evidenced by its colocalization with cholera toxin B and EGF, markers for cholesterol-dependent and clathrin-mediated endocytosis, respectively. Uptake does not appear to occur via macropinocytosis, as inhibition of this pathway by Latrunculin B-induced disassembly of actin filaments did not affect MCoTI-I uptake. In addition, a significant amount of MCoTI-I accumulates in late endosomal and lysosomal compartments and MCoTI-I-containing vesicles continue to exhibit directed movements. These findings demonstrate internalization of MCoTI-I through endocytic pathways that are dominant in the cell type investigated, suggesting that this cyclotide has ready access to general endosomal/lysosomal pathways but could readily be re-targeted to specific receptors through addition of targeting ligands (Appendix Section: paper #1).

•  We have also successfully accomplished the chemical synthesis of wild-type and chemically modified MCoTI-cyclotides. This is key for the molecular characterization of biomolecular interactions and for further development of cyclotide leads to improve their pharmacokinetic properties (Appendix Section: paper #1).


Reportable  Outcomes 

Peer-reviewed  Publications  Submitted  and  Published: 

•  J. Contreras, A. Y. O. Elnagar, S. Hamm-Alvarez and J. A. Camarero (2011) Cellular uptake of cyclotide MCoTI-I follows multiple endocytic pathways, J. Control Release, under revision (Appendix: paper #1).

•  J. A. Camarero (2011) Legume cyclotides shed new light on the genetic origin of knotted circular proteins, Proc. Natl. Acad. Sci. USA, in press (Appendix: paper #2).

•  L. Berrade, Angie E. Garcia and J. A. Camarero (2010) Protein Microarrays: Novel Developments and Applications, Pharmacol. Res., DOI 10.1007/s11095-010-0325-1 (Appendix: paper #3).

•  A. E. Garcia and J. A. Camarero (2010) Biological Activities of Natural and Engineered Cyclotides, a Novel Molecular Scaffold for Peptide-Based Therapeutics, Curr. Mol. Pharmacol., 3(3), 153-163 (Appendix: paper #4).

•  S. S. Puttamadappa, K. Jagadish, A. Shekhtman and J. A. Camarero (2010) Backbone dynamics of cyclotide MCoTI-I free and complexed with trypsin, Angew. Chem. Int. Ed., 49(39), 7030-7034 (Appendix: paper #5).

•  K. Jagadish and J. A. Camarero (2010) Cyclotides, a promising molecular scaffold for peptide-based therapeutics, Biopolymers, 94(5), 611-616 (Paper #6).

•  J. Austin, Wan Wang, Swamy Puttamadappa, Alexander Shekhtman and J. A. Camarero (2010) Biosynthesis and biological screening of a genetically-encoded library based on the cyclotide MCoTI-I, ChemBioChem, 10(16), 2663-2670 (Paper #7).

Oral  presentations: 

•  Seventh  Annual  PEGS  (Protein  Engineering  Summit)  Conference,  Phage  and  Yeast  Display  of 
Antibodies  and  Proteins  Session,  May  9-13,  2011,  Boston. 

•  2011 Spring ACS National Meeting, Division of Biological, Invited Lecture to the Ralph F. Hirschmann Award in Peptide Chemistry: Symposium in Honor of David J. Craik, March 28, 2011, Anaheim.

•  University of Uppsala, Invited seminar to The Svedberg Lecture Series, March 24, 2011, Uppsala, Sweden.

•  Pacifichem 2010: International Chemical Congress of Pacific Basin Societies, December 15-20, Honolulu, Hawaii.

•  Roche  Colorado  Corporation  Peptide  Symposium  (RCCPS)  2010,  Cyclotides,  a  novel  natural  peptide 
scaffold  for  drug  discovery,  September  14-16,  Boulder,  Colorado. 

•  Natural  Peptides  to  Drugs  (NP2D)  4th  International  Congress,  April  11-14,  2010,  Zermatt,  Switzerland. 

Patent  Applications 

•  Composition and methods for the rapid biosynthesis and in vivo screening of biologically relevant peptides, J. A. Camarero, PCT International Application No. PCT/US2010/039720.

•  Novel  cyclotide-based  polypeptides  with  antiviral  and  anticancer  activity,  J.  A.  Camarero  and  J.  Jung, 
filed  US  patent  application  #61/283,096. 


Conclusion 

The results accomplished during the first 12 months of the proposal are extremely encouraging. We have shown
that protein splicing can be used for the generation of large genetically encoded libraries (≈10^6
members), and in principle larger libraries can be generated by randomizing 2 loops (≈10^9 members) (Aim #1,
Task 2, accomplished).
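
The library sizes quoted above can be related to the number of randomized positions with a short calculation. The sketch below is illustrative only: it assumes that every randomized position can hold any of the 20 proteinogenic amino acids, and the function names are hypothetical. The report does not state how many positions were randomized in each loop, so the position counts are assumptions.

```python
def library_diversity(randomized_positions: int, alphabet_size: int = 20) -> int:
    """Theoretical number of distinct sequences for fully randomized positions."""
    if randomized_positions < 0 or alphabet_size < 1:
        raise ValueError("positions must be >= 0 and alphabet size >= 1")
    return alphabet_size ** randomized_positions


def positions_needed(target_members: float, alphabet_size: int = 20) -> int:
    """Smallest number of randomized positions whose diversity reaches the target."""
    n = 0
    while alphabet_size ** n < target_members:
        n += 1
    return n


if __name__ == "__main__":
    # Assumed targets taken from the orders of magnitude quoted in the text.
    for target in (1e6, 1e9):
        n = positions_needed(target)
        print(f"target {target:.0e}: {n} positions -> {library_diversity(n):.2e} sequences")
```

Under these assumptions, the theoretical diversity is an upper bound; the number of clones actually obtained also depends on experimental factors that the calculation does not model.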

We have also recently reported the backbone dynamics of the cyclotide MCoTI-I in the free state and complexed
to its binding partner trypsin in solution [3]. This is the first time the backbone dynamics of a natively folded
cyclotide has been reported in the literature. Such insight will help us in the design of optimal focused libraries
that can be used for the discovery of new cyclotide sequences with novel biological activities. We have also
shown that MCoTI-cyclotides can be readily synthesized by chemical means, thus allowing the introduction of
non-natural amino acids. More importantly, we have designed a FRET-based fluorescence reporter that can be used
to monitor inhibition of the RING-mediated Mdm2/MdmX interaction inside living cells using high-throughput
cell-sorting techniques (Aim #1, Task 1 and milestone #1 accomplished). Interfacing the production of
cyclotide-based cell libraries with in-cell screening methods will allow the rapid selection of a novel type of
ultrastable specific inhibitors for targeting the Mdm2/MdmX interaction.

Next year, we will create more libraries using different loops of the cyclotide MCoTI-I. We will also prepare larger
libraries with 2 loops randomized. These libraries will be screened by fluorescence-activated cell sorting using
our FRET-based genetically encoded reporter for the RING-mediated Mdm2/MdmX interaction. We will also
continue studying the ability of some cyclotides to cross cellular membranes using radiolabeling and confocal
microscopy. The study will be carried out in several cell lines, including prostate cancer cell lines.

Cyclotides, a novel natural peptide scaffold for drug discovery

Julio A. Camarero*

Cyclotides are a new emerging family of large plant-derived backbone-cyclized polypeptides (≈28-37 amino
acids long) that share a disulfide-stabilized core (3 disulfide bonds) characterized by an unusual knotted topology.
Cyclotides contrast with other circular polypeptides in that they have a well-defined three-dimensional
structure and, despite their small size, can be considered as miniproteins. The main features of cyclotides are
therefore a remarkable stability due to the cystine knot, a small size making them readily accessible to
chemical synthesis, and an excellent tolerance to sequence variations. For example, the first cyclotide to be
discovered, kalata B1, is an orally effective uterotonic, and other cyclotides have been shown to cross the cell
membrane through macropinocytosis. Cyclotides thus appear as promising leads or frameworks for peptide
drug design.

We report for the first time the in vivo biosynthesis of natively folded MCoTI-II inside live E. coli cells. The
cyclotide MCoTI-II is a powerful trypsin inhibitor recently isolated from the seeds of Momordica
cochinchinensis, a plant member of the Cucurbitaceae family. Biosynthesis of genetically encoded cyclotide-based
libraries opens the possibility of using single cells as microfactories where the biosynthesis and screening of
a particular inhibitor can take place in a single process within the same cellular cytoplasm. The cyclotide scaffold
has tremendous potential for the development of therapeutic leads based on its extraordinary stability and
potential for grafting applications. We will also report the design and biosynthesis of a MCoTI-grafted cyclotide
with the ability to specifically induce programmed cell death in cell-based assays using different tumor cell lines.

In this work we report the first analysis of intracellular uptake of MCoTI-I cyclotide in HeLa
cells using live cell imaging by confocal fluorescence microscopy. Cyclotides represent a
novel platform for drug development. Their stability, conferred by the cyclic cystine knot,
their small size, their amenability to both chemical and biological synthesis and their flexible
tolerance to sequence variation make them ideal for grafting of biologically-active therapeutic

Abstract

Cyclotides are plant-derived proteins that naturally exhibit various biological activities and whose
unique cyclic structure makes them remarkably stable and resistant to denaturation or degradation.
These attributes, among others, make them ideally suited for use as drug development tools. This
study investigated the cellular uptake of the cyclotide MCoTI-I in live HeLa cells. Using real-time
confocal fluorescence microscopy imaging, we show that MCoTI-I is readily internalized in live
HeLa cells and that its endocytosis is temperature-dependent. Endocytosis of MCoTI-I in HeLa
cells is achieved primarily through fluid-phase endocytosis, as evidenced by its significant
colocalization with 10K-dextran, but also through other pathways, as evidenced by its
colocalization with markers for cholesterol-dependent and clathrin-mediated endocytosis, cholera
toxin B and EGF, respectively. Uptake does not appear to occur via macropinocytosis, as inhibition
of this pathway by Latrunculin B-induced disassembly of actin filaments did not affect MCoTI-I
uptake. In addition, a significant amount of MCoTI-I accumulates in late endosomal and lysosomal
compartments, and MCoTI-I-containing vesicles continue to exhibit directed movements. These
findings demonstrate internalization of MCoTI-I through endocytic pathways that are dominant in
the cell type investigated, suggesting that this cyclotide has ready access to general
endosomal/lysosomal pathways but could readily be re-targeted to specific receptors through
addition of targeting ligands.


Abbreviations: Boc, tert-butyloxycarbonyl; CCK, cyclic cystine knot; CPPs, cell-penetrating
peptides; CTX-B, cholera toxin B; DAST, diethylaminosulfur trifluoride; DCM, dichloromethane;
10K-Dex, 10,000 MW dextran; DIEA, di-isopropylethylamine; DMF, dimethylformamide; EGF,
epidermal growth factor; ES-MS, electrospray mass spectrometry; Fmoc, 9-fluorenylmethyloxycarbonyl;
HBTU, 2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate; HPLC, high
performance liquid chromatography; Lat B, latrunculin B; MCoTI, Momordica cochinchinensis
trypsin inhibitor; NMP, N-methylpyrrolidone; NMR, nuclear magnetic resonance; RFP-Lamp1, red
fluorescent protein-lysosomal-associated membrane protein 1; RP-HPLC, reverse phase-high performance
liquid chromatography; TFA, trifluoroacetic acid; TIS, tri-isopropylsilane; TR, Texas Red; Trt, trityl.


Introduction 

Cyclotides  are  fascinating  micro-proteins  ranging  from  28  to  37  amino  acid  residues  that  are 
naturally  expressed  in  plants  and  exhibit  various  biological  activities  such  as  anti-microbial, 
insecticidal,  cytotoxic,  antiviral  (against  HIV),  and  protease  inhibitory  activity,  as  well  as  exert 
hormone-like  effects  [1-4].  They  share  a  unique  head-to-tail  circular  knotted  topology  of  three 
disulfide  bridges,  with  one  disulfide  penetrating  through  a  macrocycle  formed  by  the  two  other 
disulfides  and  inter-connecting  peptide  backbones,  forming  what  is  called  a  cystine  knot  topology 
(Fig.  1).  This  cyclic  cystine  knot  (CCK)  framework  gives  the  cyclotides  exceptional  resistance  to 
thermal  and  chemical  denaturation,  and  enzymatic  degradation  [4,  5].  In  fact,  the  use  of  cyclotide- 
containing  plants  in  indigenous  medicine  first  highlighted  the  fact  that  the  peptides  are  resistant  to 
boiling  and  are  apparently  orally  bioavailable  [6]. 

Cyclotides have been isolated so far from plants in the Rubiaceae, Violaceae, Cucurbitaceae [4,
7] and most recently Fabaceae families [8]. Around 160 different cyclotide sequences have been
reported in the literature [9, 10], although it has been estimated that ≈50,000 cyclotides might exist
[11, 12]. Despite the sequence diversity, all cyclotides share the same CCK motif (Fig. 1). Hence,
these micro-proteins can be considered as natural combinatorial peptide libraries structurally
constrained by the cystine-knot scaffold [2] and head-to-tail cyclization, but in which hypermutation
of essentially all residues is permitted with the exception of the strictly conserved cysteines that
comprise the knot.

Cyclotides are ribosomally produced in plants from precursors that comprise between one and
three cyclotide domains. However, the mechanism of excision of the cyclotide domains and ligation
of the free N- and C-termini to produce the circular peptides has not yet been completely
elucidated. It is suspected, however, that specific asparaginyl endopeptidases are involved in
the proteolytic processing and cyclization of the precursor proteins [13-15]. Cyclotides can also
be produced chemically using solid-phase peptide synthesis in combination with native chemical
ligation [16-19] or recombinantly in bacteria by using modified protein splicing units, or inteins
[20, 21]. The latter method can generate folded cyclotides either in vivo or in vitro using standard
bacterial expression systems [20-22] and opens the possibility of producing large libraries of
genetically encoded cyclotides, which can be analyzed by high-throughput cell-based screening for
selection of specific sequences able to bind to particular biomolecular targets [21, 23].

Figure 1. Primary and tertiary structures of MCoTI and kalata cyclotides. The structures of
MCoTI-II (PDB ID: 1IB9 [27]) and kalata B1 (PDB ID: 1NB1 [82]) are shown. Conserved cysteine
residues and disulfide bonds are shown in yellow. An arrow marks residue Lys4 located at loop 1 in
MCoTI-cyclotides. This residue in MCoTI-I was used for the site-specific conjugation of AlexaFluor488
N-hydroxysuccinimide ester (AF488-OSu) through a stable amide bond.

Cyclotides have been classified into three main subfamilies. The Möbius and bracelet cyclotide
subfamilies differ in the presence or absence of a cis-Pro residue, which introduces a twist in the
circular backbone topology [24]. A third subfamily comprises the cyclic trypsin inhibitors MCoTI-I/II
(Fig. 1), which have been recently isolated from the dormant seeds of Momordica cochinchinensis,
a plant member of the Cucurbitaceae family, and are powerful trypsin inhibitors (Ki ≈ 20-30 pM)
[25]. These cyclotides do not share significant sequence homology with other cyclotides beyond
the presence of the three cystine bridges, but structural analysis by NMR has shown that they
adopt a similar backbone-cyclic cystine-knot topology [26, 27]. MCoTI cyclotides, however, show
high sequence homology with related linear cystine-knot squash trypsin inhibitors [25], and
therefore represent interesting molecular scaffolds for drug design [19, 28-30]. Indeed, acyclic
squash inhibitors have already been used as scaffolds for the incorporation of novel bioactive
peptides to render de novo engineered knottins with novel biological activities [31, 32].

All these features make cyclotides ideal drug development tools [19, 28-30]. They are
remarkably stable due to the cyclic cystine knot [33]. They are relatively small, making them
readily accessible to chemical synthesis [16]. They can also be encoded within standard cloning
vectors and expressed in cells [20-22], and are amenable to substantial sequence variation [34],
which makes them ideal substrates for molecular grafting of biological peptide epitopes [4] and
amenable to molecular evolution strategies to enable generation and selection of compounds with
optimal binding and inhibitory characteristics [22, 34]. Even more importantly, MCoTI-cyclotides
have been shown recently to be able to enter human macrophages and breast cancer cell lines
[35]. Internalization into macrophages was shown to be mediated mainly through
macropinocytosis, a form of endocytosis that is actin-mediated and results in formation of large
vesicles termed macropinosomes [36, 37]. It should be noted, however, that in this study the
visualization of MCoTI-II uptake was done in fixed and not in live cells. Analysis of live cells
provides the ability to visualize events in real time without the possible complications of fixation
artifacts that have, for instance, confounded interpretations of the uptake of Tat and other related
peptides [38, 39]. In addition, macropinocytosis is a dominant mechanism for endocytic uptake in
macrophages [40-42], unlike other cells that are not specialized for large-scale sampling of
extracellular fluid and which use multiple alternative endocytic mechanisms. These mechanisms
can include clathrin-mediated endocytosis, caveolar endocytosis, macropinocytosis, phagocytosis,
flotillin-dependent endocytosis, as well as multiple other as yet under-characterized mechanisms
[43, 44]. Intrigued by these results, we explored the cellular uptake of site-specifically
fluorescently labeled MCoTI-cyclotides and studied the cellular uptake mechanisms in HeLa cells
using live cell imaging by confocal fluorescence microscopy.

In this work we report for the first time the cellular uptake of MCoTI-cyclotides monitored by
real-time confocal fluorescence microscopy imaging in live HeLa cells. Our results clearly show that
HeLa cells readily internalize fluorescently labeled MCoTI-I. We found that this process is
temperature-dependent and can be reversibly inhibited at 4°C, which indicates an active
mechanism of internalization. The internalized cyclotide also seems to colocalize in live cells with
multiple endocytic markers including, to the greatest extent, the fluid-phase endocytic marker
dextran (10 kDa dextran, 10K-Dex). Internalized MCoTI-I was colocalized to a lesser extent with
the cholesterol/lipid-dependent endocytic marker cholera toxin B (CTX-B) and the
clathrin-mediated endocytic marker EGF. Internalized MCoTI-I was localized within a fairly rapid time
course to late endosomal and lysosomal compartments, which engaged in rapid and directed
movements suggestive of cytoskeletal involvement. MCoTI-I uptake in HeLa cells was not
impaired by Latrunculin B (Lat B), a well-known inhibitor of macropinocytosis. Altogether, these
data seem to indicate that the MCoTI-I cyclotide is capable of internalization in live cells through
multiple endocytic pathways that may be dominant in the particular cell type under study. The lack
of strong preference for MCoTI-I internalization via a specific cellular internalization pathway is of
significant value, since the lack of endogenous affinity for a particular pathway can enable ready
re-targeting by introduction of targeting peptides within the scaffold that may enable specific and
targeted endocytic uptake to a particular target cell. At the same time, the ready uptake of MCoTI-I
by multiple pathways suggests accessibility, in the untargeted form, to essentially all cells.

Materials  and  Methods 

Analytical characterization of cyclotides: Analytical HPLC was performed on a HP1100 series
instrument with 220 and 280 nm detection using a Vydac C18 column (5 micron, 4.6 x 150 mm) at
a flow rate of 1 mL/min. Preparative and semi-preparative HPLC were performed on a Waters
Delta Prep system fitted with a Waters 2487 UV-visible detector using a Vydac C18 column (15-20 µm, 10
x 250 mm) at a flow rate of 5 mL/min. All runs used linear gradients of 0.1% aqueous trifluoroacetic
acid (TFA, solvent A) vs. 0.1% TFA, 90% acetonitrile in H2O (solvent B). Ultraviolet-visible (UV-vis)
spectroscopy was carried out on an Agilent 8453 diode array spectrophotometer. Electrospray
mass spectrometry (ES-MS) analysis was routinely applied to all compounds and components of
reaction mixtures. ES-MS was performed on an Applied Biosystems API 3000 triple quadrupole
electrospray mass spectrometer using Analyst 1.4.2. Calculated masses were obtained using
Analyst 1.4.2. All chemicals involved in synthesis or analysis were obtained from Aldrich
(Milwaukee, WI) or Novabiochem (San Diego, CA) unless otherwise indicated.
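
As a worked illustration of the linear gradients described above, the sketch below computes the solvent B percentage at a given time and the corresponding acetonitrile content, using the stated composition of solvent B (90% acetonitrile). The function names are hypothetical, and the example gradient (10-35% B over 30 min) is the one reported later for the semi-preparative purification of folded MCoTI-I. Instrument dwell volume and mixing delays are ignored.

```python
def percent_b(t_min: float, start_b: float, end_b: float, duration_min: float) -> float:
    """Percentage of solvent B at time t for a linear gradient."""
    if not 0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient")
    return start_b + (end_b - start_b) * t_min / duration_min


def percent_acetonitrile(pct_b: float, acetonitrile_in_b: float = 90.0) -> float:
    """Acetonitrile content of the mobile phase; solvent A contains none."""
    return pct_b * acetonitrile_in_b / 100.0


if __name__ == "__main__":
    for t in (0, 15, 30):
        b = percent_b(t, start_b=10, end_b=35, duration_min=30)
        print(f"t = {t:2d} min: {b:.1f}% B, {percent_acetonitrile(b):.2f}% acetonitrile")
```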

Preparation of Fmoc-Tyr(tBu)-F. Fmoc-Tyr(tBu)-F was prepared using diethylaminosulfur
trifluoride (DAST) [45] and used immediately afterwards. Briefly, to a stirred solution of Fmoc-Tyr(tBu)-OH
(459.6 mg, 1 mmol) in 10 mL of dry dichloromethane (DCM) containing dry pyridine (800 µL, 1
mmol), DAST (1.1 mL, 1.2 mmol) was added dropwise at 25°C under a nitrogen current. After
20 minutes, the mixture was washed with ice-cold water (3 x 20 mL). The organic layer was
separated and dried over anhydrous MgSO4. The solvent was removed under reduced pressure to
give the corresponding Fmoc-amino acyl fluoride as a white solid. Amino acid fluorides should be
used immediately, as they are extremely unstable and prone to hydrolysis.

Loading of 4-sulfamylbutyryl AM resin with Fmoc-Tyr(tBu)-F. Loading of the first residue was
accomplished using Fmoc-Tyr(tBu)-F according to standard protocols [46]. Briefly,
4-sulfamylbutyryl AM resin (420 mg, 0.33 mmol) (Novabiochem) was swollen for 20 minutes with dry
DCM and then drained. A solution of Fmoc-Tyr(tBu)-F [amount and solvent illegible in source] and
di-isopropylethylamine (DIEA) (180 µL, 1 mmol) was added to the drained resin and reacted at 25°C
for 1 h. The resin was washed with dry DCM (5 x 5 mL), dried and kept at -20°C until use.

Chemical synthesis of MCoTI-I. Solid-phase synthesis was carried out on an ABI433A automatic peptide
synthesizer (Applied Biosystems) using the Fast-Fmoc chemistry with the
2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU) activation protocol at 0.1 mmol
scale on a Fmoc-Tyr(tBu)-sulfamylbutyryl AM resin. Side-chain protection was employed as
previously described for the synthesis of peptide α-thioesters by the Fmoc-protocol [47], except for
the N-terminal Cys residue, which was introduced as Boc-Cys(Trt)-OH. After chain assembly, the
alkylation, thiolytic cleavage and deprotection were performed as previously described [48, 49].
Briefly, the protected peptide resin [amount illegible in source] was first alkylated with ICH2CN (174 µL, 2.4
mmol; previously filtered through basic silica) and DIEA (82 µL, 0.46 mmol) in N-methylpyrrolidone
(NMP) (2.2 mL) for 12 h. The resin was then washed with NMP (3 x 5 mL) and DCM (3 x 5 mL).
The alkylated peptide resin was cleaved with HSCH2CH2CO2Et (200 µL, 1.8 mmol) in the presence
of a catalytic amount of sodium thiophenolate (NaSPh, 3 mg, 22 µmol) in dimethylformamide
(DMF):DCM (3:4 v/v, 1.4 mL) for 24 h. The resin was then dried at reduced pressure. The
side-chain protecting groups were removed by treating the dried resin with trifluoroacetic acid
(TFA):H2O:tri-isopropylsilane (TIS) (95:3:2 v/v, 5 mL) for 3-4 h at room temperature. The resin was
filtered and the linear peptide thioester was precipitated in cold Et2O. The crude material was
dissolved in the minimal amount of H2O:MeCN (4:1) containing 0.1% TFA and characterized by
HPLC and ES-MS as the desired MCoTI-I linear precursor α-thioester [Expected mass (average
isotopic composition) = 3608.2 Da; measured = 3608.8 ± 0.3 Da]. Cyclization and folding were
accomplished by flash dilution of the MCoTI-I linear α-thioester TFA crude to a final concentration
of [value illegible in source] in 2 mM reduced glutathione (GSH), 50 mM sodium phosphate buffer at
pH 7.5 for 18 h. Folded MCoTI-I was purified by semi-preparative HPLC using a linear gradient of
10-35% solvent B over 30 min. Pure MCoTI-I was characterized by HPLC and ES-MS [Expected
mass (average isotopic composition) = 3480.9 Da; measured = 3481.0 ± 0.4 Da].
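
The agreement between expected and measured ES-MS masses reported above can be expressed as an absolute deviation and a relative error in parts per million. The following is a minimal sketch using only the values quoted in this section; the function name and the dictionary layout are illustrative assumptions, not part of the original analysis, which used the Analyst software.

```python
def mass_error(expected_da: float, measured_da: float) -> tuple[float, float]:
    """Return (absolute deviation in Da, relative error in ppm)."""
    delta = measured_da - expected_da
    return delta, delta / expected_da * 1e6


# Expected and measured average masses (Da) as quoted in the text.
reported = {
    "MCoTI-I linear precursor thioester": (3608.2, 3608.8),
    "folded MCoTI-I": (3480.9, 3481.0),
}

if __name__ == "__main__":
    for name, (expected, measured) in reported.items():
        delta, ppm = mass_error(expected, measured)
        print(f"{name}: deviation {delta:+.1f} Da ({ppm:+.0f} ppm)")
```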


Recombinant Expression of MCoTI-I. Bacterial expression and purification of MCoTI-I were
carried out as previously described [22].

Chemical labeling of MCoTI with AlexaFluor488 succinimide ester (AF488-NHS). MCoTI-I was
site-specifically labeled through the ε-amino group of residue Lys4 (Fig. 1). MCoTI-I has only one Lys
residue in its sequence (Fig. 1). Briefly, MCoTI-I (5 mg, 1.4 µmol) was conjugated with a two-fold
molar excess of AF488-NHS in 0.2 M sodium phosphate buffer (2.5 mL) at pH 7.5 for 2 h. The
reaction was quenched with 6 mM NH2-OH solution at pH 4. AF488-labeled MCoTI-I was purified
by semi-preparative HPLC using a linear gradient of 15-35% solvent B over 30 min. Pure labeled
MCoTI-I was characterized by HPLC and ES-MS [Expected mass (average isotopic composition) =
3997.9 Da; measured = 3997.4 ± 0.3 Da].
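
The quantities in the labeling protocol can be cross-checked with simple arithmetic. The sketch below, which uses hypothetical function and variable names, converts the 5 mg of MCoTI-I to micromoles using the expected average mass of folded MCoTI-I (3480.9 Da), derives the amount of dye implied by a two-fold molar excess, estimates the peptide concentration in the 2.5 mL reaction, and computes the mass shift between the expected masses of labeled and unlabeled cyclotide. It assumes the peptide mass refers to the free peptide with no counter-ions or residual water.

```python
def micromoles(mass_mg: float, mw_g_per_mol: float) -> float:
    """Convert a mass in mg to micromoles: mg / (g/mol) gives mmol, x1000 gives umol."""
    return mass_mg / mw_g_per_mol * 1000.0


MCOTI_I_MW = 3480.9        # Da, expected average mass of folded MCoTI-I (from the text)
AF488_MCOTI_I_MW = 3997.9  # Da, expected average mass of labeled MCoTI-I (from the text)

peptide_umol = micromoles(5.0, MCOTI_I_MW)     # about 1.4 umol, as stated in the text
dye_umol = 2.0 * peptide_umol                  # two-fold molar excess of AF488-NHS
peptide_mM = peptide_umol / 2.5                # umol per mL equals mM
label_shift_da = AF488_MCOTI_I_MW - MCOTI_I_MW  # mass added by one dye molecule

if __name__ == "__main__":
    print(f"peptide: {peptide_umol:.2f} umol, dye: {dye_umol:.2f} umol")
    print(f"peptide concentration: {peptide_mM:.2f} mM")
    print(f"mass shift on labeling: {label_shift_da:.1f} Da")
```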

Purification of synthetic MCoTI-I using trypsin-Sepharose beads. Preparation of
trypsin-Sepharose beads was done as previously described [21-23]. Pull-down experiments with synthetic
MCoTI-I were performed as follows: synthetic MCoTI-I cyclization/folding crude reactions were
typically incubated with 0.2 mL of trypsin-Sepharose for one hour at room temperature with gentle
rocking, and centrifuged at 3000 rpm for 1 min. The beads were washed with 50 volumes of PBS
containing 0.1% Triton X-100, then rinsed with 50 volumes of PBS, and drained of excess PBS.
Bound MCoTI-I was eluted with 0.4 mL of 8 M GdmCl and fractions were analyzed by RP-HPLC
and ES-MS.

Endocytosis Experiments. For studies of endocytic uptake mechanisms, methyl-β-cyclodextrin
(MBCD) was purchased from Sigma-Aldrich. Latrunculin B (Lat B) was purchased from
Calbiochem. Texas Red-EGF (TR-EGF), LysoTracker™ Red DND-99 (LysoRed), AF594 cholera
toxin B (AF594 CTX-B), Texas Red 10,000 MW dextran (TR-10K Dex), [product name illegible in source]
Lysosomes-RFP (RFP-Lamp1), rhodamine-phalloidin, and DAPI were all purchased from Invitrogen (Carlsbad,
CA).

Cell Culture. HeLa cells were obtained from the American Type Culture Collection (ATCC) and
were cultured in a humidified incubator at 37°C in 95% air/5% CO2 in phenol red-free Dulbecco's
modified essential medium (DMEM) (4.5 g/L glucose with 10% FBS, 1% glutamine, and 1%
non-essential amino acids) and split with trypsin/EDTA as recommended by the manufacturer.

Confocal Fluorescence Microscopy. For MCoTI-I uptake studies, HeLa cells were seeded on 35
mm glass-bottom culture dishes (MatTek, Ashland, MA) at a density of 8.5 x 10^4 cells/dish. On day
2 of culture, the cells were rinsed with PBS and the media replaced with incubation buffer (phenol
red-free, serum-free DMEM with 1% P/S and 20 mM HEPES) prior to addition of AF488-MCoTI-I
(25 µM) and incubation at 37°C for 1 hr. Following this time, excess MCoTI-I was rinsed off with a
gentle PBS wash and the media replaced prior to imaging. Intracellular distribution was analyzed at
1 hr and again at regular intervals for up to 10 hours. For assessment of distribution and
colocalization of AF488-MCoTI-I and LysoTracker™ Red, RFP-Lamp1, AF594-Cholera toxin B,
TR-10K Dex, or TR-EGF in live cells, we utilized a Zeiss LSM 510 Meta NLO imaging system equipped
with Argon and HeNe lasers and mounted on a vibration-free table for confocal fluorescence
microscopy. For analysis of the effects of Lat B pre-treatment, 2 µM Lat B was added for 30 min at
37°C prior to addition of AF488-MCoTI-I. For colocalization studies, LysoTracker™ Red (50 nM),
AF594-Cholera toxin B (10 µg/mL), TR-10K Dex (1 mg/mL), or TR-EGF (400 ng/mL) was added to
cells simultaneously with MCoTI-I prior to incubation at 37°C. Analysis of the extent of
colocalization was done at 1 hr of uptake. For colocalization with RFP-Lamp1, cells were treated
with RFP-Lamp1-expressing BacMam (2 x 10^7 particles/plate) on the previous day. For
temperature-dependent uptake studies, cells were cooled on ice for 30 min prior to the addition of
AF488-MCoTI-I in incubation buffer. After incubation at 4°C for 30 min, the cells were imaged and
subsequently incubated at 37°C for 1 hour before imaging again. For fixation and visualization of
actin filaments following treatment with or without 2 µM Lat B, the cells were fixed with 4%
paraformaldehyde prior to the addition of rhodamine-phalloidin and DAPI. For analysis of
fluorescent pixel colocalization, cells from at least 3 different experiments were analyzed
individually. Using the Zeiss LSM 510 software colocalization tool, regions of interest (ROI) were
selected and marked with an overlay to encompass all pixels, following the Zeiss manual protocol.
The threshold was automatically set from these ROIs. For time-lapse imaging, cells were incubated
with 25 µM AF488-MCoTI-I for 1 h at 37°C. Following this time, excess MCoTI-I was rinsed off with
a gentle PBS wash and the media replaced prior to imaging. The time series image capture was
set to a 2.5 second delay between scans.
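
To make the idea of thresholded pixel colocalization concrete, the sketch below computes two commonly used measures, the Pearson correlation coefficient and thresholded Manders coefficients, for a pair of channels within an ROI. This is not the algorithm of the Zeiss LSM 510 colocalization tool used in this work, whose exact computation is not described here. The function names, thresholds and toy pixel values are all hypothetical, and the channels are represented as flat lists of pixel intensities.

```python
from math import sqrt


def pearson(ch1, ch2):
    """Pearson correlation between two equally sized lists of pixel intensities."""
    n = len(ch1)
    mean1 = sum(ch1) / n
    mean2 = sum(ch2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(ch1, ch2))
    var1 = sum((a - mean1) ** 2 for a in ch1)
    var2 = sum((b - mean2) ** 2 for b in ch2)
    return cov / sqrt(var1 * var2)


def manders(ch1, ch2, thr1, thr2):
    """Thresholded Manders coefficients (M1, M2).

    M1 is the fraction of above-threshold channel-1 intensity found in pixels
    where channel 2 is also above its threshold; M2 is the converse.
    """
    above1 = sum(a for a in ch1 if a > thr1)
    above2 = sum(b for b in ch2 if b > thr2)
    both1 = sum(a for a, b in zip(ch1, ch2) if a > thr1 and b > thr2)
    both2 = sum(b for a, b in zip(ch1, ch2) if a > thr1 and b > thr2)
    m1 = both1 / above1 if above1 else 0.0
    m2 = both2 / above2 if above2 else 0.0
    return m1, m2


# Toy ROI: green = AF488 channel, red = marker channel (made-up intensities).
green = [0, 10, 200, 180, 5, 220]
red = [0, 5, 190, 20, 150, 210]

if __name__ == "__main__":
    m1, m2 = manders(green, red, thr1=50, thr2=50)
    print(f"Pearson r = {pearson(green, red):.2f}, M1 = {m1:.2f}, M2 = {m2:.2f}")
```

In practice the thresholds would come from the acquisition software or from background measurements rather than from fixed values such as those assumed here.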

Results  and  Discussion 

In order to study the cellular uptake of MCoTI-cyclotides, we decided to use MCoTI-I. MCoTI-I
contains only one Lys residue located in loop 1 versus MCoTI-II, which contains three Lys residues
in the same loop (Fig. 1). The presence of only one Lys residue facilitates the site-specific
introduction of a unique fluorophore on the sequence, thus minimizing any effect that the
introduction of this group may have on the cellular uptake properties of the cyclotide.

Folded MCoTI-I cyclotide was produced by either recombinant or synthetic methods. In both cases the
backbone cyclization was performed by an intramolecular native chemical ligation (NCL) [50-53]
using the native Cys located at the beginning of loop 6 to facilitate the cyclization. This ligation
site has been shown to give very good cyclization yields [21, 22]. Intramolecular NCL requires the
presence of an N-terminal Cys residue and a C-terminal α-thioester group in the same linear
precursor [52, 54]. In the biosynthetic approach, the MCoTI-I linear precursor was fused in frame
at its C- and N-termini to a modified Mxe Gyrase A intein and a Met residue, respectively, and
expressed in Escherichia coli [23]. This allows the generation of the required C-terminal
thioester and N-terminal Cys residue after in vivo processing by endogenous Met aminopeptidase
(MAP) [20, 55]. Cyclization and folding can be accomplished very efficiently in vitro by incubating
the MCoTI-I intein fusion construct in sodium phosphate buffer at pH 7.4 in the presence of reduced
glutathione (GSH). Biosynthetic MCoTI-cyclotides generated this way have been shown to adopt a
native folded structure by NMR and trypsin inhibitory assays [20, 22, 33].

Figure 2. Chemical synthesis of MCoTI-I. (A) Synthetic scheme used for the chemical synthesis of
cyclotide MCoTI-I by Fmoc-based solid-phase peptide synthesis. (B) Analytical reverse-phase HPLC
traces of MCoTI-I linear precursor α-thioester, cyclization/folding crude and MCoTI-I purified by either
affinity chromatography using trypsin-immobilized Sepharose beads or semipreparative reverse-phase
HPLC. HPLC analysis was performed in all cases using a linear gradient of 0% to 70% buffer B over
30 min. Detection was carried out at 220 nm. An arrow indicates the desired product in each case.

Natively folded MCoTI-II has already been successfully produced using Fmoc-based solid-phase
peptide synthesis [18, 19]. Encouraged by these results, we also explored the production of
MCoTI-I by chemical synthesis (Fig. 2). For this purpose the MCoTI-I linear precursor α-thioester was
assembled by Fmoc-based solid-phase peptide synthesis on a sulfonamide resin [48, 49] (Fig. 2A).
Activation of the sulfonamide linker with iodoacetonitrile followed by cleavage with ethyl
mercaptoacetate and acidolytic deprotection with TFA provided the fully deprotected linear peptide
α-thioester (Fig. 2B). The synthetic linear precursor thioester was then efficiently cyclized and folded
in a one-pot reaction using sodium phosphate buffer at pH 7.5 in the presence of 2 mM GSH. The
reaction was complete in 18 h and the folded product was purified by reverse-phase HPLC and
characterized by ES-MS. The measured mass was in agreement with that expected for the folded
structure (Expected mass = 3480.9 Da; measured = 3481.0 ± 0.4 Da). Synthetic folded MCoTI-I
was also shown to co-elute by HPLC with recombinant natively folded MCoTI-I (data not shown).
The  biological  activity  of  synthetic  MCoTI-l  was  assayed  by  using  a  trypsin  pull-down  experiment 

[22,  23].  As  shown  in  Figure  2B, 
synthetic  folded  MCoTI-l  was 
specifically  captured  from  a 
cyclization/folding  crude  reaction  by 
trypsin-immobilized  Sepharose  beads 
[21-23],  thus  indicating  that  was 
adopting  a  native  cyclotide  fold. 
Purified  MCoTI-l  was  site-specifically 
labeled  with  AlexaFluor  488  (AF488) 
for  live  confocal  imaging.  The  e-amino 
group  of  Lys4  residue  located  in  loop 

1  was  conjugated  to  AF488-NPIS  in 
sodium  phosphate  buffer  at  pH  7.5  for 

2  h  (Fig.  3A).  Under  these  conditions 
the  main  product  of  the  reaction  was 
mono-labeled  AF488-MCoTI  as 
characterized  by  FIPLC  and  ES-MS 
(expected  average  mass  =  3997.9 

Figure 3. Site-specific labeling of MCoTI-I with AlexaFluor-488 N-hydroxysuccinimide ester
(AF488-OSu). (A) Scheme depicting the bioconjugation process and localization of the fluorescent
probe at residue Lys4 in loop 1. (B) Analytical reverse-phase HPLC trace of pure AF488-MCoTI-I. HPLC
analysis was performed using a linear gradient of 0% to 70% buffer B over 30 min. Detection was
carried out at 220 nm. (C) ES-MS spectrum of pure AF488-MCoTI-I.

Da; measured = 3997.4 ± 0.3 Da) (Fig. 3C). AF488-labeled MCoTI-I was then purified by
reverse-phase HPLC to remove any trace of unreacted materials (Fig. 3B).
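
The mass figures quoted above can be compared with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original analysis: the function and variable names are illustrative, and the only inputs are the expected and measured masses stated in the text.

```python
# Masses in Da, as quoted in the text. All names here are illustrative.
EXPECTED_MCOTI = 3480.9      # expected mass of folded MCoTI-I
MEASURED_MCOTI = 3481.0      # ES-MS value (reported as +/- 0.4 Da)
EXPECTED_LABELED = 3997.9    # expected average mass of mono-labeled AF488-MCoTI-I
MEASURED_LABELED = 3997.4    # ES-MS value (reported as +/- 0.3 Da)


def deviation(expected, measured):
    """Signed difference (measured - expected) in Da, rounded to 0.1 Da."""
    return round(measured - expected, 1)


# Mass increase implied by attaching a single label, from the expected values.
label_shift = round(EXPECTED_LABELED - EXPECTED_MCOTI, 1)

dev_unlabeled = deviation(EXPECTED_MCOTI, MEASURED_MCOTI)
dev_labeled = deviation(EXPECTED_LABELED, MEASURED_LABELED)
```

With the quoted values, `label_shift` is 517.0 Da, `dev_unlabeled` is 0.1 Da and `dev_labeled` is -0.5 Da.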

To draw correct conclusions about the cellular uptake of native MCoTI-I from data obtained with
modified cyclotides such as AF488-MCoTI-I, it is critical to be sure that the modified cyclotides
still adopt structures similar to that of the native form. MCoTI-cyclotides are extremely stable to
chemical and thermal denaturation, and they have been shown to withstand procedures such as
reverse-phase chromatography in the presence of organic solvents under acidic conditions without
any effect on their tertiary structure [16, 18-21, 33]. It is also unlikely that acylation of the
ε-amino group of Lys4 in MCoTI-I disrupts the tertiary structure of this cyclotide. Craik and
co-workers have previously shown that biotinylation of the three Lys residues located in loop 1 of
MCoTI-II, including Lys4 (Fig. 1), does not disrupt the native cyclotide fold, as determined by
¹H-NMR [35]. We have also recently shown that mutation of residue Lys4 to Ala does not seem to
affect the ability of this mutant to adopt a native cyclotide fold, indicating that a positively
charged residue in this position is not critical for the tertiary structure of MCoTI-I [22]. Similar
findings have been reported by Leatherbarrow and coworkers, where mutation of this residue to Phe
or Val yielded MCoTI-cyclotides that still folded correctly and showed inhibitory activity against
chymotrypsin and human elastase, respectively [19]. Altogether, these facts suggest that residue
Lys4 is not critical for adopting the native cyclotide fold and that its modification is unlikely to
disturb the tertiary structure of MCoTI-cyclotides.

To study the cellular uptake of AF488-MCoTI-I we used HeLa cells. The internalization studies were
all carried out with 25 µM AF488-MCoTI-I. This concentration provided a good signal-to-noise ratio
for live cell confocal fluorescence microscopy studies and did not show any cytotoxic effect on HeLa
cells. This is in agreement with the cellular tolerance of wild-type and biotinylated MCoTI-II
reported for other types of human cell lines [35]. First, we analyzed the time course of changes in
cellular distribution following uptake of 25 µM AF488-MCoTI-I by incubating cells with the cyclotide
for 1 h and then analyzing its distribution after 1, 2, 4, 8 and 10 h. As shown in Figure 4, the
internalized cyclotide was clearly visible within perinuclear punctate spots inside the cells after
1 h incubation. Cells pulsed with AF488-MCoTI-I for 1 h and then incubated for longer periods in the
absence of cyclotide showed no evidence of decreased intracellular fluorescence, and the largely
perinuclear distribution of internalized MCoTI-I appeared comparable at all time points. Similar
results have recently been reported for the internalization of biotinylated MCoTI-II in macrophage
and breast cancer cell lines [35]; those studies, however, used fixed cells to visualize the
internalized cyclotide.

In order to study the mechanism of internalization of AF488-MCoTI-I in live HeLa cells, we first
explored the effect of temperature on the uptake process. Active, energy-dependent endocytic
mechanisms of internalization are inhibited at 4°C [56]. The internalization of AF488-MCoTI-I was
totally inhibited after a 1 h incubation at 4°C (Fig. 5). This inhibition was completely reversible:
when the same cells were incubated again at 37°C for 1 h, the punctate intracellular fluorescence
labeling pattern was restored. This result confirmed that the uptake of AF488-MCoTI-I in HeLa cells
follows a


Figure 4. MCoTI-I distribution in HeLa cells. HeLa cells were incubated with 25 µM MCoTI-I for 1
hour, the cyclotide was removed by gentle rinsing in PBS, and the cells were then monitored for
distribution of intracellular fluorescence at intervals from 1-10 hours using confocal fluorescence
microscopy. Bar = 10 µm.

temperature-dependent active endocytic internalization pathway. It should be noted that no
significant surface binding was detected at 4°C, suggesting that MCoTI-I does not bind a surface
receptor, even nonspecifically. This is in agreement with studies on MCoTI-II in fixed cells, so
both MCoTI-I and MCoTI-II appear to lack specific affinity for proteins or lipids in cell
membranes, unlike the kalata B1 cyclotide, which shows membrane affinity [35]. This lack of
endogenous affinity for a specific surface receptor or membrane constituent makes MCoTI-I ideal
for engineering with more specific, receptor-directed, peptide-based internalization motifs within
the scaffold, which might allow members of this family to be targeted to a specific cell type.

Next, we investigated the internalization pathway used by labeled MCoTI-I to enter HeLa cells.
There are several known and well-characterized mechanisms of endocytosis [57]. It is also now well
established that almost all cell-penetrating peptides (CPPs) use a combination of different
endocytic pathways rather than a single endocytic mechanism [57]. A recent study showed that
several CPPs (including Antennapedia/penetratin, nona-Arg and Tat peptides) can be internalized
into cells by multiple endocytic pathways including macropinocytosis, clathrin-mediated
endocytosis, and caveolae/lipid raft-mediated endocytosis [58]. To investigate whether that was the
case for the internalization of AF488-MCoTI-I in HeLa


Figure 5. Endocytosis of MCoTI-I is temperature-dependent. HeLa cells were incubated with 25
µM MCoTI-I for 1 h at 4°C. After removal of the MCoTI-I-containing media and a gentle PBS wash,
the cells were imaged. Following imaging, the MCoTI-I-containing media was replaced and the cells
were incubated at 37°C for 1 h and imaged again. Bar = 10 µm.


Figure 6. Colocalization of MCoTI-I with markers of endocytosis. (A) HeLa cells were incubated
with 25 µM MCoTI-I and either 1 mg/ml 10K-dextran (10K-Dex), 10 µg/ml cholera toxin B (CTX-B), or
400 ng/ml epidermal growth factor (EGF) for 1 hour at 37°C as described in Materials and Methods
and then imaged. Bar = 10 µm. (B) Quantification of pixel colocalization was done using the Zeiss
LSM software for image analysis and measures the % of total fluorescent AF488-MCoTI-I pixels in the
ROI relative to red pixels associated with the different endocytic markers (n = 13 cells for
10K-Dex, n = 11 cells for CTX-B and n = 10 cells for EGF, with cells assessed across 3 different
experiments; * p < 0.05 relative to 10K-Dex, # p < 0.05 relative to CTX-B).

cells, we examined its colocalization with various endocytic markers (Fig. 6). 10K-Dex has
previously been used as a marker of fluid-phase endocytosis [36, 59-61]. CTX-B has been used as a
marker for various lipid-dependent endocytic pathways [44, 62], while EGF has traditionally been a
marker of clathrin-mediated endocytosis [63-65]. As shown in Figure 6, after 1 h, AF488-MCoTI-I
fluorescence was significantly colocalized with the fluorescence associated with 10K-Dex (59 ± 4%
of total cyclotide fluorescence pixels were colocalized with 10K-Dex fluorescent pixels). Less
colocalization was observed with fluorescent CTX-B (39 ± 4%) and fluorescent EGF (21 ± 2%).
These data suggest that AF488-MCoTI-I primarily enters cells through fluid-phase endocytosis. The
colocalization observed with CTX-B and EGF also suggests that AF488-MCoTI-I could be using
alternative or additional endocytic pathways. The colocalization results could also be attributed,
however, to the merging of endosomal uptake vesicles generated by different pathways at the level
of an early endosome. To address whether the major uptake and colocalization of AF488-MCoTI-I with
10K-Dex was due to cointernalization by macropinocytosis, we explored the inhibition of
AF488-MCoTI-I uptake by Lat B, a potent inhibitor of actin polymerization, which is an essential
element of macropinocytosis [66-69]. As shown in Figure 7, Lat B did not significantly inhibit
uptake of either AF488-MCoTI-I (Fig. 7A) or 10K-Dex (data not shown), although treatment of HeLa
cells with this agent resulted in a total disruption of the actin filament network (Fig. 7B). These
data suggest that macropinocytosis is not responsible for the uptake of either 10K-Dex or
AF488-MCoTI-I in HeLa cells.
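
The percentages above are pixel colocalization values reported by the Zeiss LSM software, whose exact procedure is not described here. As a rough illustration of the general idea only, the sketch below computes the percentage of above-threshold green (cyclotide) pixels that are also above threshold in the red (marker) channel. The function name, the thresholds and the toy 3 x 3 images are all assumptions made for illustration, not values from the study.

```python
def percent_colocalized(green, red, green_thr, red_thr):
    """Percent of above-threshold green pixels that are also above threshold in red.

    `green` and `red` are equally sized 2D lists of pixel intensities for one ROI.
    This is a generic threshold-based measure, not the vendor software's algorithm.
    """
    green_pos = 0
    both_pos = 0
    for g_row, r_row in zip(green, red):
        for g, r in zip(g_row, r_row):
            if g > green_thr:
                green_pos += 1
                if r > red_thr:
                    both_pos += 1
    if green_pos == 0:
        raise ValueError("no green pixels above threshold in the ROI")
    return 100.0 * both_pos / green_pos


# Toy ROI (hypothetical intensities): 5 green-positive pixels, 3 of them also red-positive.
green = [[0, 200, 180], [150, 0, 220], [0, 0, 90]]
red = [[0, 190, 10], [160, 0, 5], [0, 0, 120]]
toy_result = percent_colocalized(green, red, green_thr=50, red_thr=50)
```

In this measure the denominator is the cyclotide signal, so a value of 60% means that 60% of cyclotide-positive pixels overlap the marker, not the reverse.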

As an extension of these inhibition studies, cells were also treated with MBCD, a well-established
cholesterol-depleting agent employed for studying the involvement of lipid rafts/caveolae in
endocytosis [70, 71]. Preliminary studies with MBCD suggested no significant inhibition of
AF488-MCoTI-I uptake (data not shown). Since the extent of total colocalization of AF488-MCoTI-I
with CTX-B was less than 40%, it is unsurprising that no marked effect was seen by live cell
microscopy. Taken together, these results suggest that the uptake of AF488-MCoTI-I in HeLa cells
follows multiple endocytic pathways, in agreement with what has recently been reported for
different CPPs [58].

Next we explored the fate of the endocytic vesicles containing labeled MCoTI-I. There are at least
two pathways for the cellular trafficking of endosomal vesicles. The degradative pathway routes
internalized materials from early endosomes via late endosomes to lysosomes, where they are
degraded. Recycling endosomes, on the other hand, sort material internalized into early endosomes
and are responsible for returning internalized material to the cell membrane [72]. If labeled
MCoTI-I were localized in recycling endosomes, it would be expected that its concentration in the
cell would

Figure 7. Disruption of actin does not inhibit MCoTI-I uptake. (A) HeLa cells were untreated
(control) or treated with Lat B (2 µM) for 30 min at 37°C prior to addition of 25 µM MCoTI-I.
Following uptake for 1 h at 37°C, the cells were imaged using confocal fluorescence microscopy.
Bar = 10 µm. (B) HeLa cells without treatment (control) or treated with 2 µM Lat B for 30 min at
37°C were fixed and labeled with rhodamine-phalloidin to label actin (red) and DAPI to label nuclei
(blue). Bar = 10 µm.

decrease and/or accumulate on the membrane over time, which was not the case in the time
course experiment following the cellular fate of internalized cyclotide (Fig. 4). To explore the
potential localization of labeled MCoTI-I in lysosomes, we first used LysoTracker Red (LysoRed).
This pH-sensitive fluorescent probe is used to identify acidic organelles, such as lysosomes
and late endosomes, in live cells. As shown in Figure 8A, significant colocalization (60 ± 4.0% as
determined by pixel colocalization analysis) of LysoRed and AF488-MCoTI-I was observed after
treating the cells for 1 h with both agents. As an extension of these experiments, we also
investigated the colocalization of labeled MCoTI-I and lysosomal-associated membrane protein 1
(Lamp1), an established mature lysosomal marker [73, 74]. For this experiment, live HeLa cells
were first infected with a Red Fluorescent Protein (RFP)-Lamp1-expressing BacMam virus. The


next day the cells were incubated with AF488-MCoTI-I for 1 h and imaged. As shown in Figure 8B,
colocalization was also seen for AF488-MCoTI-I and RFP-Lamp1 (38 ± 5%, as determined by pixel
colocalization analysis), suggesting that even after 1 h, a significant amount of MCoTI-I has
already reached the lysosomal compartments. Our data suggest that after 1 h, a significant amount
of MCoTI-I (≈40%) has trafficked through the endosomal pathway to the lysosomes and that ≈20% is
already localized in late endosomes or other types of acidic organelles. It has previously been
reported that the perinuclear steady-state distribution of lysosomes is a balance between movement
on microtubules and actin filaments [75-77]. Likewise, movement from early endosomal compartments
to late endosomes to lysosomes has also been shown to rely on the microtubule network [78, 79]. As
an extension of these experiments, and to investigate whether MCoTI-I-containing vesicles were
actively trafficking inside the cell, we captured time-lapse video of cells after incubation with
MCoTI-I for 1 h. Indeed, the time-lapse capture showed active movements of MCoTI-I-containing
vesicles (Fig. 9). Directed short- and long-range movements could be seen, characteristic of
movement on cytoskeletal filaments. These results suggest that while a large portion of MCoTI-I has
reached lysosomal compartments by 1 h, and some of the movements seen may be attributed to the
steady-state distribution of lysosomes, the remaining cyclotide may still be trafficking through
the cell from other membrane compartments, likely within late endosomes.
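
The approximate 40% and 20% estimates in the paragraph above follow from the two colocalization values by simple subtraction. The short Python sketch below makes that arithmetic explicit. It rests on an assumption of ours that the text does not state outright: that Lamp1-positive lysosomes are a subset of the LysoTracker-positive acidic compartments. The function name and dictionary keys are illustrative.

```python
def partition_signal(acidic_pct, lysosomal_pct):
    """Split the cyclotide signal into three pools, in percent of total signal.

    Assumes (hypothetically) that the lysosomal pool is contained in the acidic pool.
    """
    if not 0 <= lysosomal_pct <= acidic_pct <= 100:
        raise ValueError("expected 0 <= lysosomal_pct <= acidic_pct <= 100")
    return {
        "lysosomes": lysosomal_pct,                     # Lamp1-positive
        "other_acidic": acidic_pct - lysosomal_pct,     # LysoTracker-positive, Lamp1-negative
        "non_acidic": 100 - acidic_pct,                 # not in acidic organelles
    }


# Mean values quoted in the text: 60% with LysoTracker Red, 38% with RFP-Lamp1.
estimate = partition_signal(60.0, 38.0)
```

With the quoted means, `estimate` gives 38% in lysosomes and 22% in other acidic organelles, which the text rounds to about 40% and 20%. The reported uncertainties (± 4% and ± 5%) are not propagated here.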
Craik and co-workers have recently reported the uptake of biotinylated MCoTI-II by human
macrophages and breast cancer cell lines [35]. This work concluded that the uptake of MCoTI-II in
macrophages is mediated by macropinocytosis and that the cyclotide accumulates in macropinosomes
without trafficking to the lysosome. MCoTI-II shares high homology with MCoTI-I (≈97% homology,
see Fig. 1) and similar biological activity. Despite their similarities, the differences in the
cellular uptake and trafficking of MCoTI-cyclotides by macrophages versus HeLa cells could be
attributed to the cellular

Figure 8. MCoTI-I is colocalized with lysosomal compartments. (A) Untreated or
BacMam-RFP-Lamp1-treated HeLa cells were incubated with 25 µM MCoTI-I and LysoTracker Red, or
MCoTI-I alone, for 1 h at 37°C as described in Materials and Methods and then imaged. Bar = 10 µm.
(B) Quantification of % of total fluorescent AF488-MCoTI-I pixel colocalization with fluorescent
pixels associated with both markers was done using the Zeiss LSM software for image analysis (n =
14 cells for LysoTracker Red and n = 11 cells for Lamp1, with cells selected from 3 separate
experiments).


differences in endocytic preferences for these two very different cell types. Macrophages are
specialized in large-scale sampling of extracellular fluid using macropinocytosis as the dominant
endocytic pathway. Meanwhile, other types of cells may use multiple endocytic pathways, as has
recently been shown for the uptake of different CPPs in HeLa cells [58].


At this point we cannot be certain whether some labeled MCoTI-I is able to escape from
endosomal/lysosomal compartments into the cytosol. The ability to track the release of
fluorescently labeled molecules from cellular vesicles by live cell imaging of fluorescence signal
is limited, primarily because of the large dilution effect if the molecule is able to escape


Figure 9. MCoTI-I-containing vesicles are in motion. HeLa cells were incubated with 25 µM MCoTI-I
for 1 h at 37°C and then imaged using time-lapse microscopy as described in Materials and Methods.
Arrows indicate the position of the moving vesicle at 0 min, while displacement of the fluorescent
vesicle relative to the arrow shows the extent of movement over time. Bar = 2 µm.


the highly confined volume of the vesicle into the larger cytosolic volume. One way to demonstrate
the release of peptide into the cytosol, however, would be to use labels with better detection
sensitivity or to incorporate a biological activity that can be measured in the cytosol. For
example, the presence of Tyr residues in both MCoTI-cyclotides should facilitate the incorporation
of radioactive iodine into the phenolic ring of Tyr with minimal disruption of the native structure
of the cyclotide. The incorporation or grafting of biologically active peptides into the MCoTI
scaffold could also provide proof of endosomal/lysosomal escape if such biological activity could
be measured only in the cytosol. This approach has already been used to demonstrate endosomal
escape of CPPs such as the TAT peptide [80, 81]. The retention of fluorescence signal in the
perinuclear, lysosomal compartments for up to 10 h suggests that most of the cyclotide remains
within these compartments; however, given the flexibility of the cyclotide backbone to accommodate
multiple peptide sequences, subsequent studies may explore the use of targeting and endosomolytic
sequences for concomitant targeted entry and endosomal/lysosomal escape into the cytosol.

Conclusion 

This study reports the first analysis of intracellular uptake of MCoTI-I cyclotide using live
cell imaging by confocal fluorescence microscopy. Cyclotides represent a novel platform for
drug development. Their stability, conferred by the cyclic cystine knot, their small size, their
amenability to both chemical and biological synthesis, and their tolerance to sequence
variation make them ideal for grafting of biologically active therapeutic epitopes. As we show
herein, they are also capable, in the unmodified state, of utilizing multiple cellular endocytic
pathways for internalization. This makes them readily accessible in their current
state to endosomal/lysosomal compartments of virtually any cell. Without an apparent strong
preference for an existing cellular pathway or surface-expressed epitope in HeLa cells (or in
other studies with MCoTI-II in macrophages [35]), they appear highly amenable to retargeting to
exploit a particular target cell's dominant internalization pathway and/or unique surface receptor
repertoire, along with the targeted introduction of biologically active therapeutic motifs.

Cyclotides are fascinating circular proteins ranging from 28 to 37 aa residues that are naturally
expressed in plants. They exhibit antimicrobial, insecticidal, antihelmintic, cytotoxic, and
antiviral activities (1), and protease inhibitory activity (2), and can exert uterotonic effects
(3). They all share a unique head-to-tail circular knotted topology of three disulfide bridges,
with one disulfide bond penetrating through a macrocycle formed by the other two disulfide bonds
and interconnecting peptide backbones, forming what is called a cystine knot topology (Fig. 1) (1).
This cyclic cystine knot framework gives cyclotides a compact, highly rigid structure (4), which
confers exceptional resistance to thermal/chemical denaturation and enzymatic degradation (5),
thereby making cyclotides a promising molecular scaffold for drug discovery (6, 7). So far,
cyclotides have been discovered in plants from the Rubiaceae (coffee), Violaceae (violet), and
Cucurbitaceae (squash) families (8, 9), and more recently in the Fabaceae (legume) family (Fig. 1)
(10). The discovery of cyclotides in the Fabaceae family of plants represents an important new
development because this family of plants is the third largest on Earth, comprising approximately
18,000 different species. Some of these species are widely used as crops in human nutrition and
food supply. This opens the intriguing possibility of using these plants for the large-scale
production of cyclotides with pharmaceutical or agrochemical properties by using transgenic crops.
The key to accomplishing that, however, is to have a better understanding of the mechanism that
produces these interesting microproteins in this family of plants.

The report by Poth et al. in PNAS (11) brings us closer to that exciting possibility by
describing the gene encoding the protein precursor of a unique cyclotide (Cter M) isolated from
the leaf of butterfly pea (Clitoria ternatea), a representative member of the Fabaceae plant
family. All the cyclotides reported so far from the Violaceae and Rubiaceae families are
biosynthesized via processing from dedicated genes that, in some cases, encode multiple copies of
the same cyclotide, and in others, mixtures of different cyclotide sequences (Fig. 1) (12). Poth et
al. (11) reveal that the sequence encoding the cyclotide Cter M, however, is embedded within the
albumin-1 gene of C. ternatea (Fig.

Fig. 1. Genetic origin of cyclotides from different plant families. Rubiaceae (Oldenlandia affinis)
and Violaceae (Viola odorata) plants have dedicated genes for the production of cyclotides (12).
These cyclotide precursors comprise an ER signal peptide, an N-terminal Pro region, the N-terminal
repeat (NTR), the mature cyclotide domain, and a C-terminal flanking region (CTR). In contrast, the
Cter M gene (C. ternatea, Fabaceae) shows an ER signal peptide immediately followed by the cyclotide
domain, which is flanked at the C terminus by a peptide linker and the albumin a-chain. The Cter M
cyclotide domain replaces the albumin-1 b-chain. The genetic origin of the Cucurbitaceae cyclotides
(found in the seeds of M. cochinchinensis) remains to be identified.


1). Plant albumins are part of the nutrient reservoir, but they also play a role in host
defense. Generic albumin-1 genes comprise an ER signal sequence followed by an albumin chain-b, a
linker, and an albumin chain-a. In the precursor of cyclotide Cter M, the cyclotide domain replaces
the albumin chain-b domain. This interesting finding raises the question of how this replacement
took place in evolution. There are two possibilities: (i) gradual evolution of the chain-b domain
into the cyclotide domain or (ii) rapid lateral transfer of the cyclotide gene into the albumin
gene. Poth et al. (11) present evidence supporting a gradual evolutionary path, whereby the
albumin-1 chain-b slowly evolved into a more stable cyclotide domain. For example, the pea
albumin-1 subunit-b (PA1b), one of the best-studied Fabaceae albumin components, is a 37-aa peptide
from pea seeds (Pisum sativum), which also contains a cystine-knot structure (13). Remarkably, the
cystine-knot core of PA1b overlays extremely well with that of the cyclotide Cter M. The
composition and size of the PA1b loops are, however, totally different from those of the cyclotide
Cter M. Mutagenesis studies on PA1b have also recently shown that this albumin domain is highly
tolerant to mutations outside the cystine-knot core (14). These observations support the
possibility of divergent evolution of cyclotides from ancestral albumin domains, wherein evolution
and natural selection provided an alternative loop decoration of the original cystine-knot albumin
core.

A final question remains. Linear cystine-knot proteins such as PA1b are not backbone cyclized like
cyclotides. What made the cyclization process possible, allowing the final transformation of an
evolved cystine-knot albumin domain into a cyclotide? As indicated by Poth et al. (11), the
structural analysis of PA1b may reveal some clues about how this could have happened. The NMR
structure of PA1b (13) reveals that the N- and C-termini are very close to each other, and it is
possible that mutations in the albumin genes predisposed them to cyclization during the evolution
process.

However, what type of mutations could allow the backbone cyclization of a linear cystine-knot
albumin domain? Although the complete mechanism of how cyclotide precursors are processed and
cyclized has not been fully characterized yet, recent studies indicate that an asparaginyl
endopeptidase (AEP; also known as vacuolar processing enzyme or legumain) is a key element in the
cyclization of cyclotides (15, 16). It has been proposed that the cyclization step mediated by AEP
takes place at the same time as the cleavage of the C-terminal propeptide from the cyclotide
precursor protein through a transpeptidation reaction (15). The transpeptidation reaction involves
an acyl-transfer step from the acyl-AEP intermediate to the N-terminal residue of the cyclotide
domain (16). A similar process has been used for the chemical (17), chemoenzymatic (18), and
recombinant (19) production of cyclotides. AEPs are Cys proteases that are very common in plants
and are able to specifically cleave the peptide bond at the C terminus of Asn and, less
efficiently, Asp residues. All the cyclotide precursors identified so far, including those from C.
ternatea, contain a well-conserved Asn/Asp residue at the C terminus of the cyclotide domain, which
is consistent with the idea that cyclotides are cyclized by a transpeptidation reaction mediated by
AEP (15).

Despite these similarities, C. ternatea cyclotides also show some differences regarding the
residue immediately following the mechanistically conserved Asn. In the cyclotide precursors from
the Violaceae and Rubiaceae families, the C-terminal Asn/Asp residue is always followed by a small
amino acid, either Gly or Ser. However, the Cter M precursor reported by Poth et al. (11) indicates
that a small amino acid is not always required in that position. Moreover, some C. ternatea
cyclotides also have a His residue at the N terminus of the cyclotide precursor rather than the
most common Gly residue found in most cyclotide domains (10). These observations seem to indicate
that, at least in the Fabaceae family, the AEP-mediated transpeptidation step may be more tolerant
than previously recognized.

The finding that albumin genes can evolve into protein precursors that can be subsequently
processed to become cyclic was described in a recent report on the biosynthesis of the sunflower
trypsin inhibitor peptide, SFTI-1 (20). SFTI-1 is a 14-residue peptide isolated from sunflower
seeds with a head-to-tail cyclic backbone structure having only a single disulfide bond. In this
case, the SFTI-1 linear precursor is embedded within a "napin-type" 2S albumin.

The report by Poth et al. (11) indicates that the biosynthetic origin of some cyclotides is very
different from that of others, which could suggest that cyclic peptides might be more widely
distributed than is currently realized. The exceptional stability of backbone-cyclized peptides may
give them an evolutionary advantage, which may provide the driving force for the evolution of
multiple biosynthetic pathways including the use of dedicated or recycled genes, with albumins now
being implicated in the biosynthesis of two different classes of cyclic peptides.

In this context, it is worth noting that the
protein precursors of the only two cyclotides
isolated so far from the Cucurbitaceae plant
family, Momordica cochinchinensis trypsin
inhibitor I and II (MCoTI-I/II; Fig. 1),
have yet to be identified. These cyclotides
are found in the seeds of M. cochinchinensis
(a tropical squash plant) and are potent
trypsin inhibitors. MCoTI cyclotides do not
share significant sequence homology with the
other cyclotides beyond the presence of the
three cystine bridges that adopt a similar
backbone-cyclic cystine-knot topology (Fig. 1)
and are more closely related to linear
cystine-knot squash trypsin inhibitors. In fact,
an acyclic version of MCoTI-cyclotides (known
as MCoTI-III) can also be found in the seeds of
M. cochinchinensis. This situation, in which
the cyclic and linear versions of the
cystine-knot protein coexist in the same organism,
provides a unique opportunity to study the
genetic origin and evolution of these interesting
molecules. Identification of the protein
precursors for the cyclic and linear versions of
these cystine-knot trypsin inhibitors should
provide a unique snapshot of the evolutionary
process of plant cyclic cystine-knot proteins.

In summary, the work by Poth et al. (11)
provides critical information on the origin,
evolution, and processing of cyclotides from
a plant of the Fabaceae family. The discovery
of unique cyclotides as well as other cyclic
peptides from a wide range of plants is
key to defining and fully understanding the
different cyclization mechanisms used by
plants. So far, the expression of cyclotides in
transgenic plants has been attempted only in
Arabidopsis and tobacco (15, 16), in which
cyclotide expression is highly inefficient,
giving rise to mostly acyclic or truncated
proteins. The proven ability of C. ternatea to
produce fully folded cyclotides suggests that
other species of the Fabaceae family could
also be used for the production of
cyclotides. Several members of this large
family of plants are agricultural crops, which
opens the intriguing possibility of generating
genetically engineered crops for the
large-scale production of cyclotides with useful
pharmacological or agrochemical properties
in the near future.
ABSTRACT Protein microarray technology holds some
of the greatest potential for providing direct information on
protein function and potential drug targets. For example,
functional protein microarrays are ideally suited for the
mapping of biological pathways. They can be used to study
most major types of interactions and enzymatic activities that
take place in biochemical pathways and have been used for the
simultaneous analysis of multiple biomolecular interactions,
including protein-protein, protein-lipid, protein-DNA and
protein-small molecule interactions. Because of this unique
ability to analyze many kinds of molecular interactions en
masse, the very small sample amounts required and the
potential to be miniaturized and automated, protein
microarrays are extremely well suited for protein profiling, drug
discovery, drug target identification and clinical prognosis and
diagnosis. The aim of this review is to summarize the most
recent developments in the production, applications and
analysis of protein microarrays.

KEY  WORDS  drug  discovery  ■  protein  chips  ■  protein 
immobilization  ■  protein  profiling  ■  proteomics 

INTRODUCTION 

Protein microarray technology has made enormous progress
in the last decade, increasingly becoming an important
research tool for the study and detection of proteins,
protein-protein interactions and numerous other
biotechnological applications (1–4). The use of protein microarrays
has advantages over more traditional methods for the study
of molecular interactions. They consume little sample
and have potential for miniaturization. Protein
microarrays displaying multiple biologically active proteins
simultaneously have the potential to provide
high-throughput protein analysis in the same way DNA arrays
did for genomics research a decade ago. This feature
is extremely important for the analysis of protein
interactions at the proteome scale. The transition from
DNA to protein microarrays, however, has required the
development of specially tailored protein immobilization
methods that preserve protein structure and biological
function after the immobilization step. Several technologies
have been developed in the last few years that allow the
site-specific immobilization of proteins onto solid supports
for the rapid production of protein microarrays using
high-throughput expression systems, such as cell-free expression
systems (5–7). The development of appropriate detection
systems to monitor protein interactions has also been an
important challenge for the optimal use of protein
microarrays. Techniques such as fluorescence imaging,
mass spectrometry (MS) and surface plasmon resonance
(SPR) have recently been developed and adapted to
interface with protein microarrays. During the last decade, a
number of excellent reviews have appeared in the literature
describing the concept, preparation, analysis and
applications of protein microarrays, highlighting the increasing
importance of this technology (1–4,8). The aim of this
review is to summarize the latest developments in protein
microarray technology in the areas of protein
immobilization, novel protein detection schemes and applications of
this promising technology.

PROTEIN  MICROARRAYS 

Protein microarrays are usually divided into two groups:
functional protein microarrays and protein-detecting
microarrays (Fig. 1) (2,9). Functional protein microarrays are
made by the immobilization of different purified proteins,
protein domains or functional peptides. These types of
microarrays are generally used to study molecular
interactions and screen potential interacting partners.
Protein-detecting microarrays, on the other hand, are made by the
immobilization of protein capture reagents that
specifically recognize particular proteins from complex
mixtures. These microarrays are used for protein profiling,
i.e. quantification of protein abundances and evaluation of
post-translational modifications in complex mixtures.

Functional  Protein  Microarrays 

Understanding the network of molecular interactions that
defines a particular proteome is one of the main goals of
functional proteomics. Functional protein microarrays
provide an extremely powerful tool to accomplish this
daunting task, especially when assessing the activity of
families of related proteins. In 2000, Schreiber and
co-workers showed that purified recombinant proteins could
be microarrayed onto chemically derivatized glass slides
without seriously affecting their molecular and functional
integrity (10). More recently, Snyder and co-workers have
been able to immobilize ~5,800 proteins from Saccharomyces
cerevisiae onto microscope glass slides (11). This protein chip
was then probed with different phospholipids to identify
several lipid-binding proteins. The same authors also used
this proteome chip for the identification of substrates for 87
different protein kinases (12). Using this microarray data set
in combination with protein-protein interaction and
transcription factor binding data, the authors were able to
reveal several novel regulatory modules in yeast (12). Using
a similar approach, Dinesh-Kumar and co-workers were
able to construct a protein microarray containing 2,158
unique Arabidopsis thaliana proteins. This array was used for
the identification of 570 phosphorylation substrates of
mitogen-activated protein kinases, which included several
transcription factors involved in the regulation of
development, host immune defense, and stress responses (13). The
analysis of proteome-wide microarrays from yeast was also
recently used to find unexpected non-chromatin substrates
for the essential nucleosomal acetyltransferase of H4
(NuA4) complex (14). In this interesting work, the authors
discovered that the metabolic enzyme phosphoenolpyruvate
carboxykinase is a natural substrate of NuA4
and that its acetylation is critical for regulating the
chronological lifespan of yeast (14). In another example,
human proteome arrays were used for the detection of
autoimmune response markers in several human cancers
(15,16). Kirschner and co-workers have also used human
proteome arrays to identify novel substrates of the
anaphase-promoting complex (17). This was accomplished
by probing the arrays with cell extracts that replicate the
mitotic checkpoint and anaphase release and then probing
the captured proteins with antibodies specific for detecting
poly-ubiquitination (17). Functional protein microarrays
have also been used to study families of interacting protein
domains. Bedford and co-workers have shown that several
protein domains (FF, FHA, PH, PDZ, SH2, SH3, and
WW) can be immobilized onto a microarray format,
retaining their ability to mediate specific interactions (18).
Similar approaches were used to study the interactions
associated with WW domains in yeast (19) and between
Kaposi-sarcoma viral proteins and the host endocytic machinery
(20), and to evaluate the interactions between different
proline-rich peptides derived from the myelin basic protein
and several SH3 domains (21).

Fig. 1 Common formats used for the preparation of protein microarrays. Functional protein microarrays (A) are used to study and identify new
molecular interactions between proteins, small molecules or enzyme substrates, for example. Protein-detecting microarrays (B) are used to identify
proteins from complex mixtures. In the sandwich format (B, left), captured proteins are detected by a secondary antibody typically labeled with a
fluorescent dye to facilitate detection and quantification. In contrast to antibody microarrays, lysate microarrays (B, right) are typically immobilized onto
nitrocellulose-coated glass slides (FAST slides) and detected using fluorescent-labeled solution-phase specific antibodies.

Functional protein domain microarrays can also be used
to quantify protein interactions. For example, in 2004
Blackburn and co-workers used microarrays containing
multiple variants of the transcription factor p53 to study
and quantify their DNA-binding preferences (22). By using
fluorescent-labeled DNA probes, the authors were able to
produce binding isotherms and extract the
equilibrium dissociation constant for every p53 variant
(22). MacBeath and co-workers have used a similar
approach to quantify the interactions of several human
SH2 and PTB domains with different
phosphotyrosine-containing peptides derived from human ErbB receptors
(Fig. 2) (23). This type of protein microarray provides a
unique way to study the binding properties of complete
families of proteins and/or protein domains in an unbiased
way. In addition, they have the potential to generate data
that, when collected in a quantitative way, could be used
for training predictive models of molecular recognition
(24–26). For example, MacBeath and co-workers
recently used functional microarrays containing multiple
murine PDZ protein domains to screen potential
interactions with 217 genome-encoded peptides derived from the
murine proteome (24,25). The data generated were used to
train a multidomain selectivity model that was able to
predict PDZ domain-peptide interactions across the mouse
proteome. Interestingly, the models showed that PDZ
domains are not grouped into discrete functional classes;
instead, they are uniformly distributed throughout the
selectivity space. This finding strongly suggests that the
PDZ domains across the proteome are optimized to
minimize cross-reactivity (24,25).
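
As an illustration of how an equilibrium dissociation constant can be
extracted from a binding isotherm of the kind described above, the
following minimal sketch fits a simple 1:1 binding model,
F = Fmax * c / (Kd + c), to the fluorescence measured at one array spot.
The model choice, the function names and the titration values are
assumptions made for illustration only; they are not taken from the
cited studies, which may have used different models and fitting
procedures.

```python
def langmuir(conc, kd, fmax):
    """Signal predicted by a 1:1 binding model at probe concentration conc."""
    return fmax * conc / (kd + conc)


def fit_kd(concs, signals, kd_lo=1e-3, kd_hi=1e3, steps=2000):
    """Least-squares estimate of (Kd, Fmax) by scanning Kd on a log grid.

    Once Kd is fixed the model is linear in Fmax, so the best Fmax for
    each trial Kd has a closed form and no optimizer is needed.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / steps)
        x = [c / (kd + c) for c in concs]
        fmax = sum(s * xi for s, xi in zip(signals, x)) / sum(xi * xi for xi in x)
        sse = sum((s - fmax * xi) ** 2 for s, xi in zip(signals, x))
        if best is None or sse < best[0]:
            best = (sse, kd, fmax)
    return best[1], best[2]


# Hypothetical titration for one spot: probe concentrations (in µM) and
# noise-free signals simulated with Kd = 2.0 µM and Fmax = 1000 (a.u.).
concs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
signals = [langmuir(c, 2.0, 1000.0) for c in concs]

kd, fmax = fit_kd(concs, signals)
print(f"Kd ~ {kd:.2f} uM, Fmax ~ {fmax:.0f} a.u.")
```

Repeating such a fit for every spot would give one dissociation constant
per immobilized variant; with real data, the titration should span
concentrations both below and above the expected Kd for the fit to be
well constrained.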

Protein-Detecting  Microarrays 

As described above, functional protein microarrays allow
high-throughput screening and quantification of protein
interactions on a proteome-wide scale, thus providing an
unbiased perspective on the connectivity of the different
protein-protein interaction networks. Establishing how
information flows through these interacting networks,
however, requires measuring the abundance and
post-translational modifications of many proteins from complex
biological mixtures. Protein-detecting microarrays are ideal
tools for this type of analysis. One of the most frequently
used strategies to prepare this type of microarray involves
the use of monoclonal antibodies as specific protein capture
reagents. Antibodies have classically been well suited for
this task, since a large number of specific antibodies are
commercially available and can be easily
immobilized onto solid supports (4,27–30). However, the potential
problems associated with the use of antibodies for chip
assembly, which may manifest themselves as moderate
expression yields and as issues related to the stability
and solubility of these large proteins, have led to the
exploration of alternative protein scaffolds as a source of
new, more effective and stable protein capture reagents
(24,31,32). Suitable protein scaffolds that have been
proposed include fibronectin domains, the Z domain of
protein A, lipocalins and cyclotides, among others.

In general, antibody microarrays are well suited for
detecting changes in the abundances of proteins in
biological samples with a relatively large dynamic range
(33). For example, Haab and co-workers made use of
antibody microarrays for serum-protein profiling in order
to identify potential biomarkers in prostate cancer (33).
Using this approach, the authors were able to identify five
proteins (immunoglobulins G and M, α1-antichymotrypsin,
villin and von Willebrand factor) that had significantly
different levels of expression between the prostate cancer
samples and control samples from healthy individuals.

Much like a sandwich ELISA, antibody microarrays quite
often make use of a second
antibody directed towards a different epitope of the protein
to be analyzed. This facilitates the detection and
quantification of the corresponding analyte. This approach has
been used for monitoring changes in the phosphorylation
state of host proteins (34), including receptor tyrosine
kinases (35), and for serum protein profiling to identify
new biomarkers in prostate cancer (36), among other
applications. The use of this approach is usually limited,
however, by the availability of suitable antibodies that can
be used for capture and detection. Moreover, the detection
step requires the simultaneous use of multiple
fluorescent-labeled antibodies, which may increase background signal
as well as the risk of cross-reactive binding as the number of
antibodies increases. A way to overcome this problem is to
label the proteins in the biological sample to be analyzed
using one or more fluorescent dyes (37). This approach
allows one to perform ratiometric comparisons between
different samples by using spectrally distinct fluorophores.
This strategy has been employed for the discovery of
molecular biomarkers in different types of human cancer
(38–40). It should be highlighted, however, that
non-specific chemical labeling of proteins introduces chemical
modifications on their surface and, therefore, may alter
antibody recognition and lead to false signals. Also, this
approach requires the homogeneous labeling of proteins
across different samples, which in most cases cannot be
completely guaranteed. These drawbacks can, in principle,
be avoided by using a label-free detection scheme.
However, nearly all of the different methods available for
this task (see below) still lack the sensitivity required for
most biological applications.

Fig. 2 Quantitative interaction networks of tyrosine kinases
associated with the ErbB family of receptors, which were determined
using protein microarrays displaying 96 SH2 and 37 PTB domains.
The SH2 and PTB protein domains were probed with fluorescently
labeled phosphopeptides representing the different tyrosine
phosphorylation sites on the ErbB kinases. The readout of peptide
binding was monitored and quantified by fluorescence. The
interaction maps (bottom panel) were constructed from the quantitative
interaction data (156). Reprinted from reference (156) with
permission from Elsevier.
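
The ratiometric comparison of two differentially labeled samples
discussed in the text above can be sketched as follows. The example
computes a log2 ratio for every capture spot and then subtracts the
array-wide median, a common way to correct for a global difference in
labeling efficiency between the two dyes. The median-centring step, the
spot names and the intensity values are assumptions made for
illustration; they do not come from the cited studies, and centring
cannot correct labeling differences that are specific to individual
proteins.

```python
import math
from statistics import median


def log2_ratios(sample, reference):
    """Per-spot log2(sample/reference), median-centred across the array.

    Both arguments map a spot name to a background-subtracted, positive
    fluorescence intensity measured in one of the two dye channels.
    """
    raw = {spot: math.log2(sample[spot] / reference[spot]) for spot in sample}
    centre = median(raw.values())
    return {spot: value - centre for spot, value in raw.items()}


# Hypothetical intensities for four capture antibodies (arbitrary units).
tumour = {"ab1": 800.0, "ab2": 400.0, "ab3": 200.0, "ab4": 1600.0}
control = {"ab1": 400.0, "ab2": 400.0, "ab3": 400.0, "ab4": 400.0}

ratios = log2_ratios(tumour, control)
for spot in sorted(ratios):
    print(spot, round(ratios[spot], 2))
```

A positive value indicates a protein that is relatively more abundant in
the first sample; swapping the dyes between samples in a replicate
experiment is a simple additional check against dye-specific artifacts.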

Although antibody microarrays are well suited for
protein profiling, proteome-wide applications have not
yet been accomplished. This is mainly due to the lack of
available, well-validated antibodies. An ingenious solution
proposed by Lauffenburger and co-workers, however, is to
combine the data generated by microarrays with
different experimental approaches (41,42). In this
work, the authors combined data gathered from antibody
microarrays, enzymatic assays, immunoblotting, and flow
cytometry to assemble a network of ~10,000 interactions in
HT-29 cells treated with different combinations of
cytokines (41). All of this information was later used to uncover
mechanisms of crosstalk involving pro- and anti-apoptotic
signals induced by different cytokines (42).

Protein  Lysate  Microarrays 

An interesting alternative to antibody microarrays is to
immobilize cell lysates and then use specific monoclonal
antibodies to identify and quantify the presence of a
particular analyte in the corresponding lysate. This
technology was first described by Liotta and co-workers to
monitor pro-survival checkpoint proteins as a function of
cancer progression (43). The same approach has recently
been used for the discovery and validation of specific
biomarkers for disease diagnosis and patient stratification.
Utz and co-workers have also made use of lysate
microarrays to study the kinetics of intracellular signaling
by tracking 62 phosphorylation sites in stimulated Jurkat
cells, which allowed them to discover a previously unknown
connection between T-cell receptor activation and Raf-1
activity (44).

In protein lysate microarrays, every spot in the
microarray contains the entire set of biological proteins to be
analyzed. This means that in order to analyze the
abundance and modification states of different proteins
present in the lysate, it is necessary to prepare as many
copies of the array as there are proteins to be analyzed.
Lysate microarrays also denature the proteins to be
analyzed during the immobilization step onto the solid
support. This makes it impossible to study complex
protein-protein interactions and requires the use of specific and
well-validated antibodies for the recognition of specific
continuous protein epitopes. This is a serious limitation of
this technique, since it only allows the analysis of proteins
that have already been discovered and for which suitable
antibodies are available. In this regard, it should be noted
that the majority of commercially available antibodies typically
show substantial cross-reactivity and, therefore, are
not appropriate for this type of approach. Only antibodies
able to provide a single band in a standard Western blot
should be used. Moreover, the blocking and detection
protocols, as well as the composition of the lysis buffer, have
been shown to substantially affect antibody performance
(45), indicating that further developments are
required for the widespread use of this technology.

NOVEL  APPROACHES  FOR  PROTEIN 
IMMOBILIZATION 

The immobilization of proteins onto solid supports has
traditionally relied on non-specific adsorption (46,47) or
covalent crosslinking of naturally occurring chemical groups
within proteins (47–49). These approaches usually provide
a random orientation of the immobilized protein onto the
solid support, which may compromise the structural and/or
functional integrity of the protein (50). This is a key issue
for the fabrication of functional protein microarrays as
described above. The use of recombinant affinity tags as
capture reagents offers site-specific immobilization. The
most commonly used affinity tags include biotin/avidin
(51–53), His-tag/Ni2+-nitrilotriacetic acid (11,54) and
glutathione-S-transferase (GST)/glutathione (GSH) (12,55).

Immobilization of antibodies through the Fc region onto
protein A- or protein G-coated surfaces has also been used
for the creation of antibody microarrays (56,57).
Additionally, thioredoxin (58), maltose-binding protein (59) and
chitin-binding protein (60) have also been developed for the
immobilization of the corresponding fusion proteins.
Protein-DNA conjugates have also been recently reported
for DNA-directed immobilization (DDI) of proteins onto
complementary DNA microarrays (61,62). Most of these
interactions, however, are reversible and not stable over
time (63–67). The use of site-specific chemical ligation
reactions for the immobilization of proteins overcomes this
limitation by allowing the proteins to be arranged in a
defined, controlled fashion with exquisite chemical control
(see references (29,68,69) for recent reviews in this field).
This type of reaction requires two unique and mutually
reactive groups on the protein and the solid support used
for the immobilization step (Fig. 3). Ideally, the reaction
between these groups should be highly chemoselective and
compatible with physiological conditions to avoid
denaturation during the immobilization step (28,70). Finally, it
is desirable that these unique reactive groups can
be directly engineered into the proteins to be immobilized
by using standard recombinant expression techniques.

Most of the chemoselective methods suitable for
site-specific immobilization of proteins described in the
literature rely on ligation methods originally designed for the
chemical engineering of proteins (71–77). Key to these
methods is the introduction of a unique reacting group at a
defined position in the protein to be immobilized, which
can later react in a chemoselective manner with a
complementary group previously introduced into the
surface (Fig. 3, see also references (4,27,29,69,78) for recent
reviews).

Surface  Modification 

The most common solid supports employed for the
immobilization of proteins in micro- and nano-biotechnology and
biomedical applications involve the use of metals and silicon-
and semiconductor-based substrates. Trialkoxysilanes such as
3-aminopropyl-trialkoxysilane (APS) or
3-mercaptopropyl-trialkoxysilane are typically employed for the chemical
modification of silicon-based substrates for the introduction
of amino (-NH2) and thiol (-SH) groups, respectively. These
chemical groups can then be modified by the introduction
of appropriate linkers allowing the chemoselective
attachment of proteins. Long chain alkyl-trichlorosilanes are more
reactive towards the silanol group than trialkoxysilanes and
have also been employed for the chemical modification of
silicon-based substrates. The higher reactivity of long
alkyl-trichlorosilanes is due to the self-assembling properties of the
long aliphatic chains, which result in the formation of highly
ordered and densely packed monolayers with solid-state-like
properties (79,80).

Compounds containing the thiol or selenol (-SeH)
groups can also be used to modify substrates based on
transition metals, mostly gold and silver (80,81), or
semiconductor materials (48). The chemical derivatization
of gold surfaces using alkanethiols is by far one of the most
commonly employed approaches (81,82). Our group has developed
several synthetic schemes for the efficient preparation of
modified alkanethiols (83,84) that were used for the
selective immobilization of functional proteins onto gold
and glass surfaces (83–86).

Fig. 3 Site-specific and covalent immobilization of a functional
protein onto a chemically modified surface using a chemoselective
ligation reaction.

The use of organic polymeric materials, such as
polydimethylsiloxane (PDMS), poly-methylmethacrylate
(PMMA) and polycarbonate (PC), has also been explored
as a potential alternative to inorganic solid supports for the
production of protein microarrays (87,88). The use of these
materials also requires the introduction of suitable reacting
groups for the site-specific immobilization of proteins.
Common techniques employed for this task involve
the use of plasma oxidation followed by treatment with
appropriate organosilanes for the functionalization of
PDMS (89), treatment of PMMA with 1,6-hexanediamine
for the introduction of reactive amino groups (90), or the use of
sulfonation reactions on PC to provide sulfonated
surfaces (29).

Protein  Immobilization  Using  Expressed  Protein 
Ligation 

The use of Expressed Protein Ligation (EPL) for the
site-specific immobilization of biologically active proteins onto
solid supports has been pioneered by our group (84). This
approach relies on the chemoselective reaction of
recombinantly produced protein α-thioesters with surfaces
containing N-terminal Cys residues. C-terminal α-thioester
proteins can be readily expressed in Escherichia coli, using
commercially available intein expression systems (91). This
ligation reaction is exquisitely chemoselective under
physiological-like conditions and results in the site-specific
immobilization of the protein through its C-terminus. We
have successfully used this approach for the production of
protein arrays containing several biologically active proteins
on Cys-coated glass slides (84). Typically, the
immobilization reaction is performed at room temperature for 18 h
and requires a minimal protein concentration in the low
µM range for acceptable levels of immobilization (84). Yao
and co-workers have also reported a similar approach for
the selective immobilization of N-terminal Cys-containing
polypeptides (52) and proteins (92) onto solid supports
derivatized with an α-thioester group.

Schneider-Mergener  and  co-workers  have  recently 
combined  SPOT  synthesis  (93)  and  a  thioester  ligation  for 
the  creation  of  arrays  containing  more  than  10,000  variants 
of  WW  protein  domains  (94).  Using  22  different  peptide 
ligands  to  probe  the  WW  domain  arrays,  the  authors  were 
able  to  monitor  more  than  250,000  binding  experiments 
(94). 

Protein  Immobilization  Using  the  Staudinger  Ligation 
Reaction 

A modified version of the Staudinger ligation reaction has
also been employed for the chemoselective immobilization
of azido-containing proteins onto solid supports derivatized
with a suitable phosphine (71,75,95–97). The azido
function can be readily incorporated into recombinant
proteins using E. coli methionine auxotroph strains, which
incorporate an azide-bearing methionine surrogate in place
of methionine (98,99).
A reactive arylphosphine derivative can be easily
introduced onto carboxylic- or amine-containing surfaces
(63,97). It should be noted that when the protein to be
immobilized has multiple methionine residues this type of
immobilization is not site-specific. This limitation can be
overcome, however, by using in vitro EPL for the
site-specific introduction of an azido group at the C-terminus of
recombinant proteins (97). The same result can also be achieved
by reacting the corresponding protein C-terminal
α-thioesters with functional hydrazines containing the azido
group (100).

Protein  Immobilization  Using  “Click”  Chemistry 

The  site-specific  immobilization  of  azido-  or  alkyne- 
containing  proteins  onto  alkyne-  or  azido-coated  surfaces 
was  recently  accomplished  by  using  the  Cu(I)-catalyzed 
Huisgen  1,3-dipolar  azide-alkyne  cycloaddition,  also  known 
as  “click”  chemistry  (101-103). 

This  is  a  very  mild  reaction  that  usually  requires  only  the 
presence  of  Cu(I)  as  catalyst  and  is  typically  performed 
under  physiological  conditions.  Under  these  conditions,  the 
cycloaddition reaction is exquisitely regiospecific, affording
only the 1,4-disubstituted 1,2,3-triazole. The catalyst Cu(I) is
usually generated in situ by reduction of Cu(II) using
reducing agents such as tris(2-carboxyethyl)phosphine
hydrochloride (TCEP·HCl) or ascorbic acid (77).

Site-specific  incorporation  of  an  alkyne  group  at  the  C- 
terminus of recombinant proteins can also be accomplished
by using in vitro EPL (101) or nucleophilic cleavage of intein
fusion proteins with derivatized hydrazines (100). The
alkyne function has also been introduced chemo-enzymatically
into recombinant proteins by using protein
farnesyltransferases (PFTase) (102,103). This approach
allows  the  selective  S-alkylation  of  the  Cys  residue  located 
in  C-terminal  Cys-Aaa-Aaa-Xxx  motifs  (where  Xxx  =  Ala, 
Ser)  by  farnesyl  diphosphate  analogs  containing  the  alkyne 
function. 

Taki  and  co-workers  have  also  accomplished  the 
introduction  of  the  azido  function  onto  the  N-termini  of 
proteins  by  using  the  enzyme  L/F-transferase  (104),  which 
is  known  to  catalyze  the  transfer  of  hydrophobic  amino 
acids  from  an  aminoacyl-tRNA  to  the  N-terminus  (105). 
This modification, called NEXT-A (N-terminal extension
of protein by transferase and aminoacyl-tRNA synthetase), can
be  accomplished  in  one  pot  and  can  also  work  in  the 
presence  of  other  proteins  or  even  in  crude  protein 
mixtures  (106,107).  The  authors  used  this  method  to 
functionalize the N-terminus of lectin EW29Ch with
p-azidophenylalanine, which was then immobilized onto a
solid support coated with 4-dibenzocyclooctynol (DIB)
through a copper-free “click” chemistry ligation (108-110).

Waldmann  and  co-workers  have  also  developed  the 
“click  sulfonamide  reaction”  (CSR)  between  sulfonyl  azides 
and  alkynes  to  immobilize  proteins  and  other  types  of 
biomolecules  onto  solid  supports  (111).  Using  this  approach 
the authors were able to immobilize a C-terminal alkyne-modified
Ras-binding domain (RBD) of cRaf1 onto a
sulfonyl  azide  modified  surface.  The  resulting  immobilized 
protein  was  biologically  active  and  able  to  selectively  bind 
to  GppNHp-bound  Ras  but  not  to  inactive  GDP-bound 
Ras  (111). 


In  principle,  “click”  chemistry  can  be  used  for  the 
chemoselective  immobilization  of  alkyne-  or  azido- 
containing  recombinant  proteins  onto  azido-  or  alkyne- 
coated  surfaces,  respectively.  However,  it  has  been  recently 
reported  that  the  immobilization  of  alkyne-modified  proteins 
onto  azide-coated  surfaces  proceeds  more  efficiently  (101). 
This  effect  could  be  attributed  to  the  fact  that  the  alkyne 
function  coordinates  Cu(I)  in  solution  more  efficiently  than 
the  azido  group,  which  could  improve  the  immobilization 
reaction  (101).  As  for  the  other  ligation  reactions  mentioned 
above,  the  minimal  concentration  of  protein  required  for 
acceptable  levels  of  immobilization  using  this  type  of  ligation 
is typically found in the low µM range (101,102).

Protein  Immobilization  Using  Active  Site-Directed 
Capture  Ligands 

The efficiency of the ligation reactions described so far
for the site-specific immobilization of proteins onto
solid supports depends strongly on the protein concentration
required to reach acceptable levels of immobilization
(84,101,102). This intrinsic limitation could, in principle, be
minimized by introducing two complementary interacting
moieties on the protein and the surface, thus allowing the
formation of a transient and specific intermolecular
complex. The formation of this complex should bring
both reactive groups into close proximity, which would
increase the efficiency of the ligation reaction (see Fig. 4).
In  this  case,  the  efficiency  of  the  reaction  should  not  be 
dictated  only  by  the  concentration  of  the  protein  to  be 
immobilized  but  rather  by  the  affinity  constant  between  the 
two  interacting  complementary  moieties. 
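
The argument above can be illustrated with a simple 1:1 binding model. The following sketch is not taken from the cited work. It assumes an ideal equilibrium in which the protein is in large excess over the surface-bound capture ligand, so that the fraction of occupied capture sites is [P]/([P] + Kd). The function name and the example concentrations and affinities are hypothetical.

```python
def fraction_bound(protein_conc_nM: float, kd_nM: float) -> float:
    """Equilibrium fraction of surface capture sites occupied by protein.

    Assumes 1:1 binding and protein in large excess over surface sites,
    so the free protein concentration equals the total concentration.
    """
    if protein_conc_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_conc_nM / (protein_conc_nM + kd_nM)


if __name__ == "__main__":
    # Hypothetical dissociation constants (nM) for the recognition pair.
    for kd in (10.0, 200.0, 10_000.0):
        for conc in (10.0, 100.0, 1_000.0):
            occupancy = fraction_bound(conc, kd)
            print(f"Kd = {kd:>8.0f} nM, [P] = {conc:>6.0f} nM -> "
                  f"fraction bound = {occupancy:.3f}")
```

Under these assumptions, a tighter recognition pair (smaller Kd) brings a larger share of the protein to the surface at a given concentration, which is the rationale for pre-organizing the reactive groups before the covalent step.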

Mrksich  and  co-workers  have  used  this  approach  for  the 
selective  immobilization  of  cutinase  fusion  proteins  onto 
surfaces  coated  with  chlorophosphonate  ligands  (112) 
(Fig.  5).  Cutinase  is  a  22  kDa  serine  esterase,  which  can 
selectively  react  with  chlorophosphonate  ligands  (113). 
These  ligands  bind  with  high  affinity  to  the  active  site  of 
the  enzyme  by  mimicking  the  tetrahedral  transition  state 
stabilized  by  the  esterase  during  the  hydrolysis  of  the  ester 
function.  Once  the  complex  is  formed,  the  side-chain  of  the 
catalytic  serine  residue  in  the  esterase  active  site  reacts 
covalently  with  the  chlorophosphonate  group  to  form  a 
relatively  stable  phosphate  bond  (Fig.  5).  This  approach  was 
used  for  the  immobilization  of  calmodulin  (112)  and  for  the 
preparation of antibody arrays (114) onto gold-coated
self-assembled monolayers derivatized with a
chlorophosphonate capture ligand.

Johnsson and co-workers have used a similar approach
for the site-specific immobilization of proteins, but with
human O6-alkylguanine-DNA alkyltransferase (AGT) as the
protein capture reagent (115). This type of enzyme can
accept a benzyl group from O6-benzylguanine (BG)
derivatives, thus allowing the site-specific immobilization of
AGT-fusion proteins onto O6-benzylguanine-coated slides (116).

Fig. 4 Principle for site-specific protein immobilization using an
active site-directed capture ligand approach.

Protein  Immobilization  by  Protein  Trans-splicing 

The  main  limitation  of  the  site-specific  capture  methods 
described  above  is  that  they  rely  on  the  use  of  enzymes  as 
capture  reagents,  which  remain  attached  to  the  surface 
once the immobilization step is complete. The production
of protein arrays containing these large linkers could give
rise to non-specific interactions, especially in applications
involving the analysis of complex samples (11,87).

Our group has recently developed a new traceless capture
ligand approach for the site-specific attachment of proteins to
surfaces based on the protein trans-splicing process (85)
(Fig. 6). In protein trans-splicing, the intein self-processing
domain is split into two fragments, which are referred to as the
N-intein and the C-intein (117,118). In this approach the N-intein
fragment is fused to the C-terminus of the protein to be
immobilized, and the C-intein fragment is immobilized onto
the solid support. When both intein fragments interact, they
bind to each other with high affinity (Kd ≈ 200 nM for the Ssp
DnaE split-intein (85)), forming an active intein domain that
can give rise to protein splicing in trans. This results in the
immobilization of the protein of interest onto the solid support
at the same time that the split intein fragments are spliced
out into solution (see Fig. 6). We have recently used this
approach for the production of arrays containing several
biologically active proteins on chemically modified glass
slides (85). The immobilization of proteins using
trans-splicing is highly specific and efficient. For example,
protein immobilization can be readily accomplished at
concentrations in the low nM range (85). Importantly, the high
specificity of protein trans-splicing allows the direct
immobilization of proteins from complex mixtures, thus eliminating
the need for purification and/or reconcentration of the
proteins prior to the immobilization step. Furthermore,
protein trans-splicing provides a completely traceless method
of protein immobilization, since both intein fragments are
spliced out into solution once the immobilization step is
completed. Finally, protein trans-splicing was shown to be
fully compatible with cell-free protein expression systems,
which should facilitate high throughput production of
protein arrays (5,85). More recently, we have also shown
that the trans-splicing activity of the naturally occurring Ssp
DnaE split-intein can be photomodulated by introducing
photolabile backbone protecting groups on the C-intein
polypeptide (119). This opens the intriguing possibility of
light-activated immobilization of proteins onto solid
supports, which should allow rapid production of protein
arrays by using available photolithographic techniques (120).

Fig. 5 A Site-specific immobilization of cutinase-fusion proteins using an active site-directed capture ligand. B Structure of F. solani cutinase enzyme free
and bound to the inhibitor n-undecyl-O-methyl phosphonate chloride. The inhibitor is covalently bound through the side-chain hydroxyl group of the
Ser120 residue, which is located at the active site of the enzyme (113).

Fig. 6 Site-specific immobilization of proteins onto solid
supports through protein trans-splicing (85). Maltose binding
protein (MBP) was directly immobilized from (a) the soluble
cellular fraction of E. coli cells over-expressing MBP-IN and (b)
MBP-IN expressed in vitro using an in vitro
transcription/translation expression system. MBP was
detected using a fluorescent-labeled specific antibody.

PROTEIN  ARRAY  TECHNOLOGIES  BASED 
ON  CELL-FREE  EXPRESSION  SYSTEMS 

Protein arrays have been traditionally produced by cellular
expression, purification and immobilization of individual
proteins onto appropriate solid supports. The production of a
large  number  of  proteins  using  conventional  expression 
systems,  based  on  bacterial  or  eukaryotic  cells,  is  usually  a  very 
time-consuming  process  that  requires  large  amounts  of 
manpower.  Moreover,  the  presence  of  disulfide  bonds,  special 
requirements  for  folding  and  post-translational  modifications  in 
some  proteins,  especially  those  of  human  origin,  may  require 
more  specialized  expression  systems  such  as  mammalian  cells  or 
baculovirus. The stability of folded proteins in an immobilized
state over long periods of storage is another potential issue
when working with protein microarrays, especially given
the highly heterogeneous nature of proteins with regard
to their physicochemical properties and stability characteristics.

The  use  of  cell-free  expression  systems  has  been 
proposed  as  a  potential  solution  to  circumvent  some  of 
these  issues.  Because  DNA  arrays  can,  in  principle,  be 
readily  synthesized  and  are  physically  homogeneous  and 
stable,  the  issues  associated  with  availability  and  stability 
should  not  apply  in  this  case.  Hence,  cell-free  expression 
systems  have  the  potential  to  allow  the  immobilization  of 
proteins  at  the  same  time  they  are  produced  by  converting 
DNA  arrays  into  protein  arrays  on  demand  (7,121). 

Cell-Free  Protein  Expression  Systems 

Cell-free  expression  systems  make  use  of  cell  extracts  that 
contain  all  of  the  key  molecular  components  for  carrying  out 
transcription  and  translation  in  vitro.  Typically,  these  extracts 
can  be  purified  from  cell  lysates  of  different  types  of  cells. 
The  most  commonly  used  are  obtained  from  E.  coli,  rabbit 
reticulocyte  and  wheat  germ,  although  more  specialized  cell 
extracts  from  hyperthermophiles,  hybridomas,  insect,  and 
human  cells  can  also  be  employed  (7).  This  large  variety  of 
available  cell-free  expression  systems  ensures  that  proteins 
can  be  expressed  under  different  conditions  (122).  Cell-free 
systems  have  also  been  used  for  the  introduction  of  different 
biophysical  probes  during  translation  for  protein  detection 
and/or immobilization (123-125).

An  important  aspect  to  consider  when  preparing  in  situ 
protein  arrays  is  the  level  of  protein  expression.  While 
many  proteins  can  be  readily  expressed,  others  may  require 
modifications  in  the  expression  protocol  or  to  the  protein 
construct,  for  example  by  fusing  them  to  a  well-expressed 
fusion  protein.  He  and  co-workers  have  shown  that  using 
fusion  protein  constructs  containing  the  constant  domain  of 
immunoglobulin κ light chain can significantly improve the
expression levels of many proteins in E. coli-based cell-free
expression systems (126).

Protein  In  Situ  Array  (PISA) 

In this method, proteins are produced directly from DNA
in solution and then immobilized as they are produced onto
the surface through a recognition tag sequence (Fig. 7A)
(127). In general, the DNA constructs encoding the proteins
can  be  generated  by  PCR  using  designed  specific  primers 
for  the  protein  of  interest,  although  expression  plasmids  can 
also  be  used.  The  DNA  constructs  are  also  designed  with 
strong  promoters,  such  as  T7,  and  regulatory  sequences 
required  for  in  vitro  initiation  of  transcription/translation. 
An  affinity  tag  sequence  is  also  usually  encoded  into  the  N- 
or  C-terminus  of  the  protein  to  facilitate  its  immobilization 
after  the  translation  step  (Fig.  7A). 

In  this  approach,  all  the  proteins  are  expressed  in 
parallel using the appropriate in vitro transcription/translation
systems. The protein translation reaction is carried out
on the surface, which is precoated with a capture reagent
able to specifically bind to the affinity tag and immobilize
the proteins. This is typically accomplished by using
His-tagged proteins and Ni2+-NTA coated surfaces, although
other affinity tag/capture reagent combinations can also be
used.  Once  the  protein  is  translated  and  specifically 
immobilized  onto  the  surface,  any  unbound  material  can 
be  washed  away. 

The  PISA  method  was  originally  demonstrated  using  a 
small  set  of  proteins,  which  included  several  antibody 
fragments  and  the  protein  luciferase.  These  proteins  were 
immobilized onto microtiter wells and magnetic beads
(127). In this work, PISA was used in a macro format in
which ≈25 µL of cell-free expression reaction was used for
the immobilization of individual proteins. More recently,
PISA has also been miniaturized (using ≈40 nL) and
adapted for the direct production of microarrays onto glass
slides. In this method, the transcription/translation reaction
is performed for 2 h at 30°C before spotting (7).

Hoheisel  and  co-workers  have  further  developed  the 
miniaturization  of  PISA  using  an  on-chip  system  based  on  a 
multiple  spotting  technique  (MIST)  (128).  In  this  approach, 
the DNA template is first spotted (≈350 pL) on the surface,
followed by the in vitro transcription/translation mixture on
the same spot. The authors used His-tagged GFP as a
model protein that was immobilized onto Ni2+-NTA-coated
glass slides. It was estimated that with unpurified
PCR products, as little as 35 fg (≈22,500 molecules) of
DNA was sufficient for the detection of GFP expression in
sub-nL volumes (128). The same authors also adapted the
system for the high throughput expression of libraries by
designing a single specific primer pair for the introduction
of the required T7 promoter and terminator, and demonstrated
the in situ expression using 384 randomly chosen
clones from a human fetal brain library (128). In principle,
the optimized and miniaturized version of PISA should be
able to produce high-density protein microarrays containing
as many as 13,000 spots per slide using a variety of
different genomic sources in a relatively uncomplicated
fashion.
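
The relationship between the 35 fg of template and the roughly 22,500 molecules quoted above can be checked with a short calculation. This is only a sketch: it assumes an average mass of about 660 g/mol per base pair of double-stranded DNA, and the 1,400 bp template length is a hypothetical value chosen because it approximately reproduces the quoted figure (the template length is not stated in this section). The function names are illustrative.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
G_PER_MOL_PER_BP = 660.0          # assumed average for double-stranded DNA


def dna_copies(mass_fg: float, length_bp: float) -> float:
    """Number of dsDNA molecules in a given mass (femtograms)."""
    mass_g = mass_fg * 1e-15
    molar_mass = length_bp * G_PER_MOL_PER_BP   # g/mol for the whole template
    return mass_g / molar_mass * AVOGADRO


def implied_length_bp(mass_fg: float, copies: float) -> float:
    """Template length (bp) implied by a mass and a molecule count."""
    mass_g = mass_fg * 1e-15
    return mass_g * AVOGADRO / (copies * G_PER_MOL_PER_BP)


if __name__ == "__main__":
    # Hypothetical 1,400 bp PCR product.
    print(f"copies in 35 fg of a 1,400 bp template: {dna_copies(35.0, 1400):.0f}")
    # Length implied by the figures quoted in the text.
    print(f"implied template length: {implied_length_bp(35.0, 22_500):.0f} bp")
```

Under these assumptions the quoted numbers are mutually consistent for a template of roughly 1.4 kb.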
Fig. 7 In situ methods for protein arraying by PISA (A), NAPPA
(B) and puromycin-capture from RNA arrays (C).

Nucleic Acid Programmable Protein Array (NAPPA)

NAPPA  is  another  approach  that  allows  the  on-chip 
transformation  of  DNA  arrays  into  protein  arrays 
(Fig.  7B).  NAPPA  was  initially  developed  by  LaBaer  and 
co-workers,  and  uses  transcription  and  translation  from  an 
immobilized  DNA  template  (67,129),  as  opposed  to  PISA, 
where  the  DNA  template  is  kept  in  solution.  In  NAPPA,  the 
expression  plasmids  encoding  the  proteins  as  GST  fusions 
are  biotinylated  and  immobilized  onto  a  glass  slide 
previously  coated  with  avidin  and  an  anti-GST  antibody, 
which  acts  as  the  protein  capture  reagent.  This  plasmid 
array  is  then  used  for  in  situ  expression  of  the  proteins  using 
rabbit  reticulocyte  lysate  or  a  similar  cell-free  expression 
system. Once the proteins are translated, they are immediately
captured by the immobilized antibody within each
spot.  This  process  generates  a  protein  array  in  which  every 
protein  is  co-localized  with  the  corresponding  expression 
plasmid.  In  general,  NAPPA  provides  good  quality  protein 
spots with limited lateral spreading, although some variation
can be observed in the quality of the arrays generated
by  this  approach. 

The  first  demonstration  of  the  NAPPA  approach  was 
carried  out  by  the  immobilization  of  8  different  cell  cycle 
proteins, which were immobilized at a density of 512 spots
per slide (67). It was estimated that ≈10 fmol of protein
was captured on average per spot, ranging from 4 to
29 fmol for the different proteins, which was sufficient for
functional studies. The authors used this protein array to
map and identify new interactions between 29 human
proteins involved in initiation of DNA replication. These
data were used to establish the regulation of Cdt1 binding
to  select  replication  proteins  and  map  its  geminin-binding 
domain  (67). 

As  with  PISA,  NAPPA  allows  the  protein  array  to  be 
generated  in  situ,  thus  removing  any  concerns  about  protein 
stability  during  storage.  However,  it  requires  the  cloning  of 
the  genes  of  interest  and  biotinylation  of  the  resulting 
expression  plasmids  to  facilitate  their  immobilization  onto 
the  chip  (Fig.  7B).  Furthermore,  the  technology  does  not 
generate  a  pure  protein  microarray,  but  rather  a  mixed 
array in which the different GST fusion proteins are
co-localized with their corresponding expression plasmids,
avidin  and  the  capture  antibody. 

In  Situ  Puromycin-Capture  from  mRNA  Arrays 

Tao  and  Zhu  have  ingeniously  adapted  the  mRNA  display 
technology for the production of protein microarrays by
capturing  the  nascent  polypeptides  through  puromycin 
(Fig.  7C)  (130).  In  this  approach,  the  PCR-amplified  DNA 
construct  is  transcribed  into  mRNA  in  vitro,  and  the  3'-end 
of  the  mRNA  is  hybridized  with  a  single-stranded  DNA 
oligonucleotide  modified  with  biotin  and  puromycin.  These 
modified  RNAs  are  then  arrayed  on  a  streptavidin-coated 
glass  slide  and  allowed  to  react  with  a  cell-free  lysate  for  in 
vitro translation. During the translation step, the ribosome
stalls when it reaches the RNA/DNA hybrid section of the
molecule, and the DNA is then cross-linked to the nascent
polypeptide through the puromycin moiety. Once the
translation reaction is finished, the mRNA is digested with
RNase, leaving a protein array immobilized through the
C-termini to the DNA linker, which is immobilized through a
biotin/streptavidin interaction to the surface. This technology
was first exemplified by the immobilization of GST,
two kinases, and two transcription factors (130). The
transcription factors retained the ability to specifically bind
DNA on the chip. This approach provides well-defined
non-diffused protein spots as a result of the precise
co-localization of the mRNA with puromycin and the 1:1
stoichiometry of mRNA versus protein. However, this method
requires extra manipulations involving the reverse transcription
and modification of the RNA before the spotting
process, which may limit its practical use for the creation
of  large  protein  microarrays.  Furthermore,  the  amount  of 
protein  produced  is  proportional  to  the  amount  of  mRNA 
spotted,  since  there  is  no  enzymatic  amplification  involved  as 
in  the  PISA  and  NAPPA  approaches. 

DETECTION  METHODS 

In  order  to  analyze,  identify  and  quantify  the  proteins  or 
any  other  type  of  biomolecules  captured  by  the  protein 
microarray,  it  is  necessary  to  have  detection  methods  that 
can  provide  high  throughput  analysis,  high  signal-to-noise 
ratio, good resolution, high dynamic range and reproducible
results, with relatively low instrumentation costs. Most
of  the  methods  available  for  this  task  can  be  classified  as 
label-dependent  and  label-free  detection  methods  (see 
references  (1,131)  for  recent  reviews). 

Label-Dependent  Methods  of  Detection 

Fluorescence-based detection is probably the most commonly
used method in protein microarrays. This is
mainly  due  to  its  simplicity,  relatively  high  sensitivity 
and  compatibility  with  already  available  DNA-array 
scanners.  Protein-detecting  microarrays  usually  employ 
a  sandwich  assay  fluorescence-based  detection  system  in 
which  captured  proteins  are  detected  by  a  secondary 
fluorescent-labeled  antibody  (Fig.  1).  This  assay  provides  a 
higher  specificity  than  the  immunoassay  based  on  a  single 
antibody,  since  it  reduces  potential  cross-reactivity  issues. 
The sensitivity of fluorescence detection can also be
improved by using the rolling circle amplification (RCA)
method, which has been successfully applied for the
profiling of different cytokines with detection limits in
the fM range (132,133). The main limitation of these
methods, however, is that they require two distinct capture
reagents per protein to be analyzed, which means that if
there are 1,000 proteins to be analyzed, at least 2,000
antibodies are required.

Specific  fluorescence  biosensing  probes  have  also  been 
used for the quantitative analysis of protein phosphorylation
and protein kinase activity on functional protein
microarrays. For example, the Pro-Q Diamond dye is a
novel fluorescent phosphorylation sensor that allows the
detection of phosphoproteins at sub-picogram levels of
sensitivity (134). Hamachi and co-workers have also
developed a fluorescence-based method for imaging
mono-phosphorylated polypeptides by using
bis(Zn2+-dipicolylamine)-based artificial sensors (135). Such chemical
approaches do not require the use of anti-target antibodies
and therefore represent a good approach for high throughput
screening of protein phosphorylation and kinase
activity.

The  use  of  fluorescent-labeled  substrates  immobilized 
onto  a  microarray  format  has  also  been  reported  to  study 
enzymatic  specificity  in  a  high  throughput  format.  Ellman 
and  co-workers  have  used  this  approach  to  determine  the 
P-site  substrate  specificity  of  several  serine  and  cysteine 
proteases (136). In their work, the fluorophore
7-amino-4-methylcoumarin (AMC) was covalently attached to a
peptide  microarray  containing  different  amino  acids  at  the 
different  P-site  positions.  The  corresponding  sequence 
preferences  were  determined  by  analyzing  the  remaining 
fluorescence  on  the  chip  after  performing  the  proteolytic 
reaction.  Yao  and  co-workers  have  also  used  a  similar 
approach  for  screening  the  activities  of  different  types  of 
enzymes,  including  proteases,  epoxide  hydrolases,  and 
phosphatases  by  linking  the  substrate  to  the  surface  through 
a  fluorogenic  linker  (137).  The  same  authors  have  also 
developed  a  different  approach  for  the  activity-based 
detection  of  enzymes  using  a  microarray  format,  in  which 
the  samples  containing  the  enzymes  to  be  analyzed  are 
immobilized onto surfaces and then visualized with
fluorescently labeled mechanism-based inhibitors (138).

The  protein  fingerprinting  (PFP)  technique  is  another 
fluorescence-based  detection  method  that  has  been 
employed  for  the  analysis  of  protein  microarrays.  This 
approach  makes  use  of  fluorophore-labeled  capture 
reagents  that  change  their  fluorescent  properties  once  they 
bind to the target protein; thus, by comparing patterns, the
proteins of interest can be identified while any signal
coming from non-specific interactions is discriminated
(139,140). This approach does not use high affinity
capture  reagents,  such  as  antibodies,  but  rather  uses 
relatively  weak  binders  such  as  synthetic  polypeptides. 

Other label-dependent methods include the use of
radioactivity, especially for enzymatic reactions such as
phosphorylation, due to its sensitivity and specificity. For
example, Schreiber and co-workers have used it to monitor
kinase activity in combination with radioisotope-labeled
ATP (10). Snyder and co-workers have employed this
approach to study the activities and substrate preferences of
119 different protein kinases (87). The use of
radioisotope-labeled molecules, however, may raise safety concerns, thus
limiting its potential for high throughput analysis. The use
of chemiluminescence-based detection schemes also provides
high selectivity and sensitivity, but with a limited resolution
and dynamic range (141).

Label-Free  Detection 

The  use  of  fluorescence-based  detection  methods  is  by  far 
one  of  the  most  commonly  employed  approaches  for  the 
detection  of  proteins.  However,  there  are  several  limitations 
to  this  approach.  For  example,  labeling  of  proteins  in 
samples  or  specific  protein  capture  reagents  such  as 
antibodies  may  alter  the  surface  of  the  proteins  and 
therefore  their  binding  properties.  It  is  also  a  very  time- 
consuming  technique,  especially  when  a  multitude  of 
samples  need  to  be  labeled.  Another  potential  issue  is  the 
variability  in  labeling  efficiency  of  proteins  across  different 
samples.  This  is  a  critical  issue,  especially  when  non-specific 
labeling  techniques  are  employed,  since  small  variations  in 
the  temperature  and  reaction  duration,  for  example,  can 
seriously  influence  the  efficiency  of  protein  labeling. 

These  limitations  have  sparked  the  development  of  novel 
label-free  detection  schemes  involving  mass  spectrometry 
(MS)-  and  optical  spectroscopy-based  measurements 
(131,142). 

In  particular  MS-based  detection  has  already  been  used 
for  the  discovery  of  disease-associated  biomarkers  (143).  For 
example,  the  use  of  surface-enhanced  laser  desorption 
ionization  time-of-flight  (SELDI-TOF)  MS  allows  the 
detection  of  captured  proteins  without  the  need  for  labeling 
(144). In fact, SELDI has been widely used for the discovery
and detection of biomarkers associated with several types of
cancer (145-150). More recently, Becker and Engelhard
have also used matrix-assisted laser desorption/ionization
time-of-flight (MALDI-TOF) for the direct read-out of
protein/protein interactions using protein-DNA microarrays
generated by DNA-directed immobilization (DDI) (61)
(see  reference  (62)  for  a  recent  review  in  this  field).  The 
authors  used  this  approach  for  the  rapid  detection  of 
activated  Ras  in  cell  lysates  from  several  cell  lines. 

Another  well-established  label-free  detection  method  is 
surface  plasmon  resonance  (SPR).  SPR  can  also  provide 
kinetic  information  on  binding  events.  In  this  approach,  the 
appropriate  capture  reagents  are  immobilized  onto  a  gold 
surface,  and  quantification  of  the  captured  proteins  is 
carried  out  by  measuring  the  change  in  the  reflection  angle 
of light after hitting the gold surface (151). For example, an
SPR imaging method was recently used for the high
throughput screening of molecules able to target the
interaction between the retinoblastoma tumor suppressor
RB and the human papillomavirus (HPV) E7 proteins
(152). The E7 protein is produced by high-risk human
papillomavirus (HPV) and induces degradation of the
retinoblastoma tumor suppressor RB through a direct
interaction, and it has been suggested as a potential
molecular target in cancer therapy. In this work, a
glutathione-coated SPR chip was used for the immobilization
of the E7 GST-fusion protein, which was then
complexed  with  His-tagged  RB  protein  in  the  presence  of 
different  RB-binding  peptides  derived  from  a  motif  of  the 
E7  protein.  Some  of  these  peptides  were  shown  to 
antagonize  the  interaction  between  His-tagged  RB  and 
GST-E7  in  a  concentration-dependent  manner  (152). 

A  conventional  SPR  system,  however,  can  only  use  a 
single  channel  per  experiment.  The  recent  development  of 
SPR  microscopy  allows  the  analysis  of  hundreds  of 
biomolecular  interactions  simultaneously  in  large  protein 
microarrays (>1,300 spots) allowing for qualitative screening
and quantitative kinetics experiments in a high
throughput  format  (153). 
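
The kinetic information mentioned above is commonly extracted by fitting sensorgrams to a binding model. The following is a minimal sketch that assumes a simple 1:1 (Langmuir) interaction, which is not specified in the works cited here; the function names and parameter values are illustrative only.

```python
import math


def spr_response(t, conc, ka, kd, rmax):
    """Association-phase SPR response for an assumed 1:1 binding model.

    t    : time since analyte injection (s)
    conc : analyte concentration (M)
    ka   : association rate constant (1/(M*s))
    kd   : dissociation rate constant (1/s)
    rmax : response at full surface saturation (arbitrary units)
    """
    kobs = ka * conc + kd              # observed rate constant
    req = rmax * ka * conc / kobs      # plateau (equilibrium) response
    return req * (1.0 - math.exp(-kobs * t))


def equilibrium_dissociation_constant(ka, kd):
    """KD = kd / ka for a 1:1 interaction."""
    return kd / ka


if __name__ == "__main__":
    # Hypothetical rate constants, chosen only for illustration.
    ka, kd, rmax = 1e5, 1e-3, 100.0
    for t in (0, 100, 500, 2000):
        print(t, round(spr_response(t, 1e-8, ka, kd, rmax), 2))
    print("KD (M):", equilibrium_dissociation_constant(ka, kd))
```

In an array format, each spot would yield its own response curve, so the same model could be fitted spot by spot to obtain rate constants in parallel.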

The  anomalous  reflection  (AR)  technique  is  another 
spectroscopic  detection  scheme  that  has  been  suggested  as 
an  alternative  to  SPR.  AR  is  a  characteristic  property  of  gold 
that  causes  a  large  decrease  in  the  reflectivity  of  blue  or  purple 
light (380 nm < λ < 480 nm) on a gold surface upon
adsorption  of  a  transparent  dielectric  layer  on  its  surface  (154). 
The  AR  technique  requires  relatively  less  complex  optics 
than  the  SPR  systems  and  has  the  potential  to  offer 
miniaturized  and  parallelized  measurements;  therefore,  it 
could  be  potentially  suitable  as  a  high-throughput  analytical 
platform. This approach has been used so far with some
success for analyzing biotin/avidin, calmodulin/synthetic
α-helical peptide, and T7 phage-displayed protein/synthetic
peptide interactions (154,155). At this point,
however,  AR-based  detection  of  microarrays  needs  to  be 
further  developed  for  detection  of  multiplexed  protein- 
protein  interactions  beyond  the  proof-of-concept. 

APPLICATIONS 

Some  of  the  applications  of  protein  microarrays  have 
already  been  discussed  in  the  previous  sections.  The  most 
prominent applications include high-throughput proteomics,
biomarker research and drug discovery. Several
reviews  focusing  on  the  biomedical  applications  of  protein 
microarrays  have  been  published  recently  (3,8). 

Proteomics 

Functional protein microarrays are ideal bioanalytical
platforms to carry out high-throughput proteomics.
Perhaps the most advanced example of this application to
date was reported by MacBeath and co-workers, who studied
the phosphorylation states of the ErbB receptor kinase
family using functional protein microarrays (23,156). The
first three members of the ErbB family of receptor tyrosine
kinases, ErbB1-3, are involved in the activation of a wide
variety of signaling pathways that are frequently
misregulated in cancer. ErbB4, on the other hand, is not
involved in tumorigenesis and has been shown to have a
protective role in some cancers. In order to study in more
detail the role of this receptor tyrosine kinase, the authors
first used tandem mass spectrometry to identify 19 sites of
tyrosine phosphorylation on ErbB4. These phosphopeptides
were then used to probe a functional protein microarray
containing 96 SH2 and 37 PTB protein domains encoded in the
human genome. The data obtained were used to build a
quantitative interaction network for ErbB4 and to identify
several new interactions, which led to the finding that
ErbB4 can bind and activate STAT1 (Fig. 2).

Deng  and  co-workers  have  also  studied  protein-protein 
and  protein-DNA  interactions  on  a  global  scale  in  the  plant 
A.  thaliana  by  making  use  of  functional  microarrays  (157). 
The  authors  created  a  microarray  containing  up  to  802 
different  transcription  factors  from  A.  thaliana.  The  proteins 
were  expressed  using  a  yeast  expression  system  and  arrayed 
onto  FAST  glass  slides,  which  are  commercially  available 
slides  coated  with  a  nitrocellulose  membrane.  The  resulting 
microarray  was  probed  with  different  fluorescent-labeled 
oligonucleotides  containing  known  binding  sites  for  several 
transcription  factors  of  the  AP2/ERF  family.  Using  this 
approach  the  authors  were  able  to  confirm  known 
interactions  and  identify  48  new  ones.  These  included  four 
transcription  factors  that  were  able  to  bind  the  evening 
element  and  showed  an  expected  clock-regulated  gene 
expression  pattern,  thus  providing  a  basis  for  further 
functional  analysis  of  their  roles  in  circadian-regulated  gene 
expression (157). The same authors also used this microarray
for detecting novel protein-protein interactions and
were able to discover four novel partners that interact with
transcription factor HY5, which is a key regulator of
photomorphogenesis in A. thaliana (157).

It  should  be  highlighted,  however,  that  the  production  of 
whole-proteome  microarrays  is  technically  a  challenging 
task,  since  it  requires  the  isolation  of  a  large  number  of 
functional  proteins.  Furthermore,  the  analysis  of  whole- 
genome  microarrays  is  complicated  due  to  the  fact  that  they 
only  represent  particular  time  snapshots  of  the  proteome. 
Moreover,  proteins  not  only  differ  in  structure  and  function 
but  also  in  their  cellular  localization,  turnover  rates  and, 
more importantly, abundance. However, the use of this
technology in proteomic research still offers the
unprecedented ability to monitor the biomolecular
interactions of thousands of samples in parallel, which by
far outweighs all the difficulties and limitations
associated with their use and preparation.

Biomarker  Research 

The  use  of  protein  microarrays  in  biomarker  research  has 
received  special  interest  in  the  areas  of  viral  diagnostics  and 
cancer  research.  For  example,  the  examination  and 
identification  of  particular  protein  profiles  in  early-stage 
cancers  could  lead  to  early  detection  of  tumors  and  the 
development  of  improved  therapies  for  cancer  patients. 
Antibody-based  microarrays  are  by  far  the  most  frequently 
used  in  biomarker  profiling  and  discovery  for  cancer 
research.  For  example,  Cordon-Cardo  and  co-workers 
have used an antibody array composed of 254 different
antibodies  to  discriminate  bladder  cancer  patients  from 
control  patients  (40).  Snyder  and  co-workers  have  also  used 
protein  microarrays  to  profile  antibodies  against  human 
severe  acute  respiratory  syndrome  (SARS)  virus  and  related 
coronaviruses  (158).  In  their  study,  the  authors  used  82 
different  coronavirus  GST-fusion  proteins,  which  were 
expressed  in  yeast  and  arrayed  onto  FAST  glass  slides. 
These  arrays  were  used  to  profile  the  sera  of  two  patient 
groups  (more  than  600  samples  obtained  from  patients  in 
China  and  Canada)  with  ~90%  accuracy  (158).  Using  this 
approach, it was possible to distinguish patients infected
with SARS and HCoV-229E, two different human
coronaviruses. These results were further validated by
statistical methods and an indirect immuno-fluorescence
test, and also showed that microarray profiling was similar
in sensitivity to standard indirect immuno-fluorescence
tests but was more specific (158).
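
For readers less familiar with the terms, accuracy, sensitivity and specificity are all derived from the same classification counts. The sketch below uses hypothetical counts, not data from the cited study, to show how the three figures are computed; the function name is an illustrative assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic figures from classification counts.

    tp / fn : infected patients called positive / negative
    tn / fp : uninfected controls called negative / positive
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true cases detected
        "specificity": tn / (tn + fp),  # fraction of controls correctly cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    for name, value in diagnostic_metrics(tp=45, fp=10, tn=40, fn=5).items():
        print(f"{name}: {value:.2f}")
```

Two assays can therefore share the same sensitivity while differing in specificity, which is the situation described for the microarray and the immuno-fluorescence test.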

LaBaer and co-workers have also used protein microarrays
generated by the NAPPA approach for tumor antigen
profiling in breast cancer (16). In this work, sera from breast
and ovarian cancer patients were tested for p53-specific
antibodies using a microarray displaying 1,705 different
non-redundant tumor antigens. These results were also
corroborated by standard indirect immuno-blotting techniques (16).

The  described  examples  are  just  a  sample  of  the  recent 
applications  of  protein  microarrays  in  biomarker  profiling 
and  discovery,  and  illustrate  the  great  potential  of  this 
technology  in  biomedical  applications. 

Drug  Discovery 

Protein microarrays have also been used in drug discovery
for target identification and validation. In 2004, Schreiber
and co-workers described for the first time the use of a
protein microarray for high-throughput screening of small
molecules (159). In this work, the authors used a protein
microarray obtained by spotting different His-tagged and
GST-fusion proteins onto chemically modified glass slides.
These arrays were used to screen the molecular targets of
six small-molecule inhibitors of rapamycin (SMIRs) that
were previously identified for their ability to rescue growth
of yeast cells exposed to rapamycin in a phenotype-based
chemical genetic suppressor assay. To facilitate the
screening process, the SMIRs were conjugated to biotin, and
the bound SMIRs were then detected using fluorescent-labeled
streptavidin. These results allowed the identification of a
new, previously unknown member of the target of rapamycin
(TOR) signaling pathway (159).

Protein  microarrays  can  also  be  used  in  an  indirect 
fashion  for  screening  and  selecting  small  molecules  able  to 
antagonize  protein  interactions.  For  example,  antibody 
arrays  can  be  used  to  screen  and/or  profile  the  proteome 
for  changes  in  protein  expression  and/or  post-translational 
modifications,  such  as  phosphorylation,  induced  by  the 
presence  or  absence  of  a  particular  drug  candidate. 

Sokolov  and  Cadet  have  used  protein  microarrays  to 
study  the  correlation  between  the  levels  of  expression  of 
different  proteins  and  the  behavioral  phenotype  of  mice 
treated  with  methamphetamine  (METH)  (160).  METH 
abuse  has  been  shown  to  stimulate  aggressive  behaviors  in 
humans  and  in  other  animals.  The  authors  found  that  mice 
treated  chronically  with  METH  demonstrated  increased 
aggressiveness  and  hyper-locomotion  when  compared  to  an 
untreated  control  group.  In  this  work,  a  total  of  378 
different  monoclonal  antibodies  specific  for  proteins  related 
to  signal  transduction,  oncogene  products,  cell  cycle 
regulation,  cell  structure,  apoptosis,  and  neurobiology, 
among  others,  were  used  to  prepare  the  protein-detecting 
array  (160).  This  antibody  microarray  was  incubated  with 
proteins  extracted  from  the  brain  of  untreated  and  METH- 
treated  mice  and  labeled  with  fluorescent  dyes.  The  data 
revealed a decrease in the natural abundance of the
proteins Erk2 and 14-3-3ε in the striata of the mice
chronically treated with METH. Since the kinase Erk2 is
thought to be the principal component of the classical
mitogen-activated protein (MAP) kinase pathway and
protein 14-3-3ε is an inhibitor and substrate of protein
kinase  C,  the  reduction  in  these  two  proteins  suggests  that 
repeated  exposure  to  METH  might  alter  MAP  kinase- 
related  pathways  involved  in  behavioral  change  (160). 

These  examples  clearly  illustrate  the  potential  of  protein 
microarrays  for  drug  discovery  applications.  Despite  the 
numerous  advantages  in  the  preparation  and  analysis  of 
these  types  of  reagents,  their  use  in  drug  discovery  has  been 
limited  so  far. 

CONCLUDING  REMARKS 

The aim of this review is to highlight the latest
developments in the preparation, analysis and biotechnological
applications of protein microarrays. Just before MacBeath
and Schreiber reported for the first time the use of protein
microarrays in 2000 (10), the concept of using protein
microarray technology was simply regarded as a dream. A
decade later, the number of publications on protein
microarray technologies has increased dramatically. There
are approximately 32,000 publications indexed in PubMed
(http://www.ncbi.nlm.nih.gov/pubmed) under the keyword
protein microarrays. We have seen numerous examples
that  show  protein  microarrays  are  a  very  valuable  tool  for 
the  study  of  whole  proteomes  (11-13,18,23,24),  protein 
identification  and  profiling  for  early  diagnosis  of  diseases 
such  as  cancer  (16,40)  or  viral  infections  (158)  and  for  drug 
identification  and  validation  (159,160). 

Despite the large number of successful examples in the
use of protein microarrays in biomedical and
biotechnological applications during the last 10 years, there
are still some challenges that need to be tackled. For
example, most of the methods commonly employed for the
immobilization of proteins onto solid supports rely on
non-site-specific immobilization techniques (10,46,47,49,161).
The  use  of  these  methods  usually  results  in  the  proteins 
being  displayed  in  random  orientations  on  the  surface, 
which  may  compromise  the  biological  activity  of  the 
immobilized  proteins  and/or  provide  false  results  (162). 
This  issue  has  been  addressed  over  the  last  few  years  by  the 
development  of  novel  site-specific  immobilization 
approaches  which  involve  the  use  of  chemoselective  ligation 
reactions  (52,84,92,97,101,102),  active  site-directed  capture 
ligands  (112,116,163-165)  and  protein  splicing  (68,85), 
among  others. 

The expression and purification of thousands of proteins
without compromising their structural integrity and biological
activity is also a challenging task. The use of cell-free
expression  systems  in  combination  with  nucleic  acid  arrays, 
which  are  more  readily  available  and  easier  to  prepare,  has 
been  shown  to  give  good  results  to  produce  in  situ  protein 
arrays  from  DNA  (67,127,129)  and  RNA  arrays  (130).  The 
combination  of  these  approaches  with  site-specific  and 
traceless  methods  of  protein  immobilization  such  as  protein 
trans-splicing  (68,85)  shows  great  promise. 

The introduction of label-free detection methods, such
as surface plasmon resonance and mass spectrometry, also
shows great promise to simplify protein microarray
analysis, since labeling of the interacting partners will
no longer be required.

The standardization of protein microarray production is
another issue that needs to be improved. At this time, most
of the methods used by the scientific community for
preparing and analyzing protein microarrays are not
completely standardized. The adoption of stringent
standards by the scientific community for the production and
analysis of these valuable reagents should, in principle,
allow the generation of data that could be compared and
exchanged across different studies and different research
groups.

None of these challenges is insurmountable; in fact,
as we have seen in this review, much progress has
been made over the last decade to address them. At this
point, we strongly believe that protein microarray
technology is on the brink of becoming a standard
technique in research in the same way as DNA microarray
technology is used today.

Abstract: Cyclotides are a growing family of large plant-derived backbone-cyclized polypeptides (≈30 amino acids long)
that share a disulfide-stabilized core characterized by an unusual knotted structure. Their unique circular backbone
topology and knotted arrangement of three disulfide bonds make them exceptionally stable to thermal, chemical, and
enzymatic degradation compared to other peptides of similar size. Currently more than 100 sequences of different cyclotides
have been characterized and the number is expected to increase dramatically in the coming years. Considering their
stability, biological activities and ability to cross the cell membrane, cyclotides can be exploited to develop new peptide-based
drugs with high potential for success. The cyclotide scaffold can be engineered or evolved using molecular evolution to
inhibit protein-protein interactions implicated in cancer and other human diseases, or to design new antimicrobials. The
present review reports the biological diversity and therapeutic potential of natural and engineered cyclotides.

INTRODUCTION

Head-to-tail or backbone-cyclized peptides are present
throughout nature from bacteria to animals. Bacteria and
fungi express numerous backbone-cyclized peptides that are
currently in use as therapeutic agents [1]. For example,
cyclosporin A is a fungal peptide with potent
immunosuppressive properties and is used to treat organ
transplant patients [2]. Daptomycin is a 13-amino acid cyclic
lipopeptide with a decanoyl side chain isolated from
Streptomyces roseosporus that has recently been approved to
treat infections caused by Gram-positive organisms, including
multi-resistant strains [3]. In animals, the only known
circular peptides are θ-defensins, which are expressed in
blood leukocytes and bone marrow of Old World monkeys [4, 5].
θ-Defensins are antimicrobial peptides with broad-spectrum
activities against bacteria, fungi, and viruses [6-8].
Backbone-cyclized peptides have also been found in plants [9].
Sunflower trypsin inhibitor 1 (SFTI-1), for example, is a
bicyclic 14-residue peptide found in sunflower seeds. SFTI-1
is the most potent known naturally occurring Bowman-Birk
trypsin inhibitor [10]. Cyclotides, a novel family of small
globular backbone-cyclized micro-proteins (≈30 residues long),
are also naturally found in plants. Here, we review the
properties of cyclotides and the latest developments in the
use of the cyclotide scaffold to design novel peptide-based
therapeutics.

CYCLOTIDES, A NOVEL ULTRASTABLE MOLECULAR SCAFFOLD

Cyclotides are small globular micro-proteins with a
unique head-to-tail cyclized backbone, which is stabilized by
three disulfide bonds (Fig. 1). Currently, over 140 sequences
have been identified in the plant families Rubiaceae,
Violaceae, and Cucurbitaceae [11]. Natural cyclotides have
various activities, including insecticidal [12, 13], uterotonic
[14], anti-HIV [15], antimicrobial [16, 17], antitumor [18],
and antihelminthic [19, 20] activities, and have been reported
to cross cell membranes [21]. Their insecticidal and
antihelminthic properties suggest that they may function as
defense molecules in plants.

Cyclotides share a unique head-to-tail circular knotted
topology of three disulfide bridges, with one disulfide
penetrating through a macrocycle formed by the two other
disulfides and inter-connecting peptide backbones, forming what
is called a cyclic cystine knot (CCK) motif (Fig. 1). The
CCK topology is responsible for the high stability of
cyclotides to enzymatic, thermal and chemical degradation
[22]. The cyclotide family is divided into three structurally
distinct subfamilies: Möbius, bracelet, and trypsin inhibitor
(Fig. 1). Möbius cyclotides are distinguished from
bracelet cyclotides by the presence of a cis-Pro residue in
loop 5. Trypsin inhibitor cyclotides have very different
primary structures from Möbius and bracelet cyclotides, but
retain the conserved cystine knot motif. Trypsin inhibitor
cyclotides share a high sequence homology with related
cystine-knot trypsin inhibitors found in squash such as EETI-II
(Ecballium elaterium trypsin inhibitor II), and in fact can be
considered cyclized homologs of these protease inhibitors.
Thus, cyclotides can be considered natural combinatorial
peptide libraries that are structurally constrained by the
cystine-knot scaffold [23] and head-to-tail cyclization but are
permissive of hypermutation of essentially all residues with
the exception of the strictly conserved cysteines that comprise
the knot [24-27].

Hence, cyclotides form a unique family of structurally
related peptides that possess remarkable stability due to the
cystine knot, a small size that makes them readily accessible
to chemical synthesis, and an excellent tolerance to sequence
variations. Moreover, the first cyclotide to be discovered,
kalata B1, is an orally effective uterotonic [14]. Intriguingly,
the cyclotide MCoTI-II has also been shown to cross cell
membranes through macropinocytosis [21]. We have also
recently found that MCoTI-I has similar cellular-uptake
properties to MCoTI-II (unpublished results). All of these
features make cyclotides ideal tools for the development of a
totally novel class of peptide-based therapeutics.

Fig. (1). Primary and tertiary structures of representative cyclotides from the Möbius (kalata B1; pdb ID: 1NB1 [108]), bracelet
(cycloviolacin O1, pdb ID: 1NBI [108]), and trypsin inhibitor (MCoTI-II, pdb ID: 1IB9 [109]) subfamilies. Conserved cysteine residues and
disulfide bonds are shown in yellow. The blue line denotes the circular backbone.

CYCLOTIDE  BIOSYNTHESIS 

Cyclotides are ribosomally synthesized as precursor
proteins, which consist of an endoplasmic reticulum
(ER)-targeting sequence, a pro-region, a highly conserved
N-terminal repeat (NTR) region, a mature cyclotide domain,
and a C-terminal tail (Fig. 2). The combined NTR-cyclotide
segment may contain one copy of the cyclotide sequence or
there may be multiple copies of the same or different
cyclotide sequences separated by additional NTR sequences.
The precursor undergoes post-translational processing to
generate a circular peptide by a mechanism that has not been
completely elucidated yet [28, 29]. It has been hypothesized
that a conserved Asn (or Asp) at the C-terminal cleavage site
may be a recognition site for asparaginyl endoproteinase
(AEP) for cyclization of the peptide in plants [28, 29]. AEP
has been shown to be involved in the post-translational
processing of concanavalin A from the jackbean [30], and
therefore it is possible that this enzyme may be involved in
the cyclization of other plant peptides. Studies using
transgenic plants that express a cyclotide precursor have
demonstrated the involvement of AEP and the requirement for
the asparagine residue in the cyclotide sequence [28, 29].
The authors showed that inhibition of AEP led to a decrease
in the amount of cyclic product and an accumulation of linear
peptides that were transiently expressed [29]. In addition, a
complementary study showed that mutation of the asparagine
residue or truncation of the conserved C-terminal
tripeptide in transgenic plants resulted in no circular peptide
production [28].

BIOLOGICAL ACTIVITIES OF NATURALLY OCCURRING CYCLOTIDES

The natural function of cyclotides appears to be in the
protection of plants against insects [12, 13], nematodes [20, 25],
and mollusks [31]. Studies have demonstrated that cyclotides
can suppress the growth and development of insect and
nematode larvae. Various other studies have also shown that
cyclotides have antimicrobial, hemolytic, uterotonic, and
anti-HIV activities. Many of these activities likely involve
interaction of the cyclotide with membranes, although the
mechanism of action is not fully understood.

The first cyclotide discovered, kalata B1, was identified
in the plant Oldenlandia affinis in central Africa in the
1960s [32]. This plant was used locally to make a tea
extract that was taken to accelerate childbirth during labor
[14, 33]. The main active ingredient in the tea extract was
found to be a peptide that was named kalata B1, after the
local name for the native medicine. The uterotonic properties
of kalata B1 indicated that the peptide was orally
bioavailable. The complete sequence, Cys-knotted arrangement,
and cyclic nature of kalata B1 were determined 25 years after
its original discovery [34]. Since the discovery of kalata B1,
many more related cyclotides have been discovered and
found to have various biological activities [35, 36].

Fig. (2). Schematic representation of the putative mechanism of protease-catalyzed cyclization for cyclotides [28, 29]. The prototypic linear
precursor protein (top) comprises an endoplasmic reticulum (ER)-targeting sequence (dark blue), a pro-region (purple), an N-terminal repeat
region (Ntr; green), a cyclotide domain (light grey) and a C-terminal tail (red). It also features a conserved asparagine (yellow) at the
C-terminal cleavage point of the cyclotide domain. The precursor is processed in the ER and vacuole, disulfide bonds are formed (yellow), and
a range of unidentified proteases (brown) trim the precursor. In the final stage, the active-site cysteine of an AEP (yellow) displaces the
C-terminal tail to form an enzyme-acyl intermediate (boxed). This intermediate is then attacked by the cyclotide N-terminal glycine to form the
mature cyclic peptide. Figure taken from reference [68].

Cyclotides were initially hypothesized to have
antimicrobial activities based on the presence of hydrophilic
and hydrophobic patches, which give an amphipathic character
similar to classical antimicrobial peptides. The antimicrobial
activities of cyclotides have been reported by two groups
with conflicting results on the potency of kalata B1 against
Escherichia coli and Staphylococcus aureus. In one study,
kalata B1 was active against S. aureus, but not E. coli [16],
and in the second study, the peptide had the reverse effect
[17]. This discrepancy is likely due to technical differences
between the experiments. Although kalata cyclotides are
amphipathic, the overall charge is close to zero at neutral pH,
making it unlikely that they interact with bacterial membranes
electrostatically in the manner of classical cationic
antimicrobial peptides [37]. Further studies are necessary to
investigate the mechanisms of antimicrobial action given the
growing occurrence of antibiotic resistance in microorganisms.


The anti-HIV properties of cyclotides have been
extensively studied [15, 38-40]. They appear to mainly act by
inhibiting viral entry into host cells as studies have shown a
dose-dependent increase in cytoprotection [15]. This
suggests the peptides may block binding or fusion of the virus,
but the mechanism remains unclear. In general, cyclotide
bioactivities appear to involve interactions with membranes
and, therefore, this may be a mechanism for anti-HIV
activities, by preventing fusion of the viral and host cell
membranes. Studies have shown cyclotides can bind to model
lipid membranes by surface plasmon resonance [41], and
that binding occurs mainly through the peptide hydrophobic
patches exposed on the surface [42-44]. This suggests
membrane binding may be one mode for cyclotide activity against
microorganisms.

Cyclotide interactions with membranes have also been
suggested as their mechanism for cytotoxic activity. Studies
have demonstrated antitumor activities of cyclotides, which
were selective against cancer cell lines and solid tumors
compared to normal mammalian cells [18, 45, 46]. Cancer
cells differ from normal cells in the lipid and glycoprotein
composition, which alters the overall net charge. The
differences in cytotoxic potency among cyclotides are related
to the three-dimensional structure as well as specific amino
acid residues within the sequence [46, 47].

In addition to having antimicrobial and antitumor
activities, some cyclotides have been found to cause extensive
hemolysis of human and rat erythrocytes [16, 40, 48]. The
cyclotide kalata B1 has strong hemolytic activity, although
this can be eliminated by mutation to Ala of any one of eight
residues located in the bioactive face of the molecule [25]. A
more recent study has also shown that the hemolytic activity
of kalata B1 could be reduced or completely eliminated by
mutation to Lys of any of the residues involved in either the
bioactive or hydrophobic faces of kalata B1 [49]. The same
study showed that the hemolytic, insecticidal and
nematocidal activities of the different kalata B1 mutants were
correlated, indicating there may be a common mechanism
involving a cyclotide-membrane interaction [49]. The same
authors have also shown recently that the all-D analogue of
kalata B1 had nematocidal activity similar to that of the
native peptide, thus corroborating that the biological
activity is not mediated by a cellular receptor [20]. In
contrast, other cyclotides such as MCoTI-cyclotides have shown
little or no hemolytic activity, thus demonstrating the
diversity of cyclotide properties.

Another cyclotide with interesting biological activity is
cyclopsychotride (Cpt) A. Cpt A is a natural cyclotide
obtained from the organic extract of the tropical plant
Psychotria longipes that has been reported to have neurotensin
inhibition properties [50]. Cpt A was able to inhibit
neurotensin binding to its receptor in HT-29 cell membranes
with an IC50 ≈ 3 µM and to increase intracellular Ca2+ levels
in a concentration-dependent manner, which could not be
blocked by neurotensin antagonists [50]. Cpt A, however,
showed a similar activity in two unrelated cell lines that did
not express neurotensin receptors, indicating that the
mechanism of action is unlikely to be mediated through an
interaction with the neurotensin receptor [50].

CHEMICAL  SYNTHESIS  OF  CYCLOTIDES 

Cyclotides are small peptides, approximately 30 amino
acids long, and therefore can be readily synthesized by
chemical methods using solid-phase peptide synthesis [51].
Chemical synthesis using a solid-phase approach has been
utilized to generate native cyclotide structures as well as
grafted analogues [52-56]. This method uses an
intramolecular native chemical ligation [57], in which the
peptide sequence contains an N-terminal cysteine and an
α-thioester group at the C-terminus [58-60]. Both
tert-butyloxycarbonyl (Boc)- and
9-fluorenylmethyloxycarbonyl (Fmoc)-based chemistries have
been used to incorporate C-terminal thioesters, either
during chain assembly (Boc) [61-63] or using
safety-catch-based linkers (Fmoc) [60, 64-67]. Once the
peptide is cleaved from the resin, both cyclization and
folding are carried out in a single-pot reaction.

RECOMBINANT  EXPRESSION  OF  CYCLOTIDES 

Cyclotides have also been produced recombinantly in
bacteria through intramolecular native chemical ligation (see
above) by using a modified protein splicing unit or intein
(Fig. 3) (see reference [68] for a recent review). This method
can generate folded cyclotides either in vivo or in vitro using
standard bacterial expression systems [26, 69, 70]. Inteins
are internal self-processing domains that undergo post-translational
processing to splice together flanking external
domains (exteins) [71]. The approach uses a modified intein
fused to the C-terminus of the cyclotide sequence to allow
the formation of an α-thioester at the C-terminus of recombinant
polypeptides. To obtain the required N-terminal cysteine
for cyclization, the peptide can be expressed with an
N-terminal leader peptide, which can be cleaved either
in vivo or in vitro by proteolysis or auto-proteolysis [68].
The simplest way to accomplish this is to introduce a Cys
immediately downstream of the initiating Met residue. Once
translation is completed, endogenous methionyl aminopeptidases
(MAPs) remove the Met residue, thereby generating
an N-terminal Cys residue in vivo [72-76]. The
N-terminal Cys can then capture the reactive thioester in an
intramolecular fashion to form a backbone-cyclized polypeptide
(Fig. 3). Additional methods to generate an N-terminal
cysteine have used exogenous proteases to cleave the leader
sequence, either after purification or in vivo by co-expressing
the protease [77]. For example, the protease Factor Xa has been
used to remove an N-terminal recognition sequence preceding a
cysteine residue [59, 78]. Other proteases that have been
used for this task include ubiquitin C-terminal hydrolase [79,
80], tobacco etch virus (TEV) protease [77], enterokinase
[81] and thrombin [82]. The N-terminal pelB leader sequence
has been used recently to direct newly synthesized
fusion proteins to the E. coli periplasmic space, where the
corresponding endogenous leader peptidases [83, 84] can
generate the desired N-terminal cysteine-containing protein
fragment [85]. Besides proteases, protein splicing has also
been used to produce recombinant N-terminal Cys-containing
polypeptides. Some inteins can be modified in
such a way that cleavage at the C-terminal splice junction
can be accomplished in a pH- and temperature-dependent
fashion [86-88].

Intein-mediated backbone cyclization of polypeptides has
also been recently used for the biosynthesis of the Bowman-Birk
inhibitor SFTI-1 [89]. The biosynthesis of other cyclic
peptides such as backbone-cyclized α-defensins and naturally
occurring θ-defensins is currently underway in our
laboratory.

Another approach to generate cyclic peptides in vivo is
protein trans-splicing. This approach utilizes a self-processing
intein that is split into two fragments, an N-intein
and a C-intein. This method has not yet been applied to the
biosynthesis of cyclotides, but has been used to produce
other natural cyclic peptides and genetically-encoded libraries
of small cyclic peptides [90, 91]. It should be noted,
however, that these systems require the presence of specific
amino acid residues at both intein-extein junctions for efficient
protein splicing to occur [90, 92, 93].

DESIGNING CYCLOTIDES WITH NOVEL BIOLOGICAL ACTIVITIES

The unique properties associated with the cyclotide scaffold
make cyclotides extremely valuable tools in drug discovery.
Several studies have used the cyclotide
molecular scaffold to graft peptide sequences and to generate
libraries for the purpose of engineering cyclotides with novel
biological functions (Fig. 4).

Fig. (3). Intein-mediated backbone cyclization for the biosynthesis of cyclotides kalata B1 (kB1) and MCoTI-II in E. coli cells [69, 70]. The
backbone cyclization of the linear cyclotide precursor is mediated by a modified protein splicing unit or intein. The cyclized product then
folds spontaneously in the bacterial cytoplasm.

The plasticity of the cyclotide framework was first demonstrated
by substituting hydrophobic residues in loop 5 of
kalata B1 with polar and charged residues [24]. The mutated
cyclotides retained the native fold of kalata B1, but were no
longer hemolytic [24]. This showed that cyclotides are
amenable to sequence changes and, interestingly, can be
modified to change their biological functions.

The potential of grafted cyclotides was first demonstrated
in a study aimed at developing novel anticancer peptide-based
therapeutics [94]. In this work a peptide antagonist of angiogenesis
was grafted into various loops of the kalata B1 scaffold
[94]. The grafted cyclotide containing the vascular endothelial
growth factor A (VEGF-A) antagonist sequence in
loop 3 was found to adopt a native CCK fold and was biologically
active at low micromolar concentrations. Additionally,
the grafted cyclotide showed increased resistance to
degradation in human serum. This study demonstrates the
possibility of using the cyclotide scaffold to stabilize bioactive
peptide epitopes, which would otherwise be degraded.

Fig. (4). Using the cyclotide molecular scaffold for drug design. Summary of the changes engineered into the cyclotide framework to introduce
new biological functions. Peptide sequences have been successfully grafted into loop 3 of kalata B1 [94] and loops 1 and 6 of MCoTI-II
[95] and MCoTI-I (unpublished results), respectively. Cyclotide-based libraries have been generated using loops 2, 3, 5 and 6. The cyclotide
structure used in the figure corresponds to MCoTI-II (pdb ID: 1IB9 [109]).

The utility of the cyclotide scaffold in drug design has
also been recently shown by engineering non-native activities
into the cyclotide MCoTI-II. MCoTI-II is a naturally
occurring trypsin inhibitor (Ki ~ 20 pM, where Ki is the
dissociation constant for inhibitor binding) found in the seeds of
Momordica cochinchinensis, a tropical plant from the squash
family. Mutation of the P1 residue in the active loop (loop 1,
see Fig. 1) of the cyclotide produced several MCoTI-II analogs
with different specificities towards different proteases
[95]. Interestingly, several analogs showed activity against
the foot-and-mouth-disease virus (FMDV) 3C protease, a
Cys protease key for viral replication, in the low micromolar
range [95]. This is the first reported peptide-based inhibitor
for this protease and, although the potency was relatively
low, this study demonstrates the potential of using MCoTI-based
cyclotides for designing novel protease inhibitors [95].

In a more recent study, the same authors also generated
inhibitors of the serine proteases β-tryptase and human leukocyte
elastase (HLE) using the backbone of MCoTI-II [96].
β-Tryptase is implicated in allergic and inflammatory disorders,
and HLE has been associated with respiratory and pulmonary
disorders. Replacing the P1 residue in loop 1 produced
several MCoTI-II mutants (K6A and K6V) with activity
against HLE with Ki values of 20-30 nM [96] and Ki values
against trypsin above 1 µM. Removal of the SDGG peptide
segment in loop 6 yielded a β-tryptase inhibitor with a Ki
~ 10 nM without significantly altering the three-dimensional
structure as determined by NMR [96]. The authors hypothesized
that deletion of the aspartic acid residue in MCoTI-II
improves activity by removing repulsive electrostatic
interactions with β-tryptase, which would account for the
160-fold improvement in the inhibitory constant against
β-tryptase compared with wild-type MCoTI-II.

In addition to displaying biological activities, MCoTI-based
peptides have also been shown to cross cell membranes
in macrophage and breast cancer cell lines through
macropinocytosis [21]. We have also found that grafting of a
helix region from the molluscum contagiosum virus (MCV)
FLICE-inhibitory protein (FLIP) into loop 6 of MCoTI-I
yielded a folded cyclotide able to cross cell membranes and
trigger apoptosis of virally infected cells (unpublished results),
thus indicating that loop 6 may be used for grafting
purposes without affecting cellular uptake.

MCoTI cyclotides share high sequence homology with
related cystine-knot trypsin inhibitors found in squash, and
could be considered cyclized homologs of these protease
inhibitors. Squash cystine-knot trypsin inhibitors have also
been used successfully to graft biological activities. For example,
the RGD sequence was grafted into loop 1 of EETI-II,
yielding an EETI-II analog with platelet inhibitory activity
[97]. The engineered proteins were much more potent in
inhibiting platelet aggregation than the linear grafted peptides,
thus highlighting the importance of grafting an epitope
into a stable peptide scaffold. These highly stable peptides
could have clinical use, for example, in the treatment of
patients with acute coronary syndrome.

Additionally, the cyclotide scaffold can be engineered or
evolved through molecular evolution techniques to selectively
bind extracellular receptors such as G protein-coupled
receptors to promote or block cell signaling [98]. These data
demonstrate the versatility of the cyclotide scaffold and
highlight the extraordinary pharmacological properties of
MCoTI-cyclotides and related linear knottins, thus confirming
the potential of Cys-knotted polypeptide scaffolds in
peptide-based drug discovery [27].

SCREENING  OF  CYCLOTIDE-BASED  LIBRARIES 

The ability to create cyclic polypeptides in vivo opens up
the intriguing possibility of generating large libraries of cyclic
polypeptides. Thus, libraries of genetically-encoded cyclic
polypeptides containing billions of members can be
readily generated using standard molecular recombinant
tools. This tremendous molecular diversity allows one to
perform selection strategies mimicking the evolutionary
processes found in nature.
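
As a rough illustration of the library sizes mentioned above, the theoretical diversity of a library grows as 20^n with the number n of fully randomized positions. The sketch below is only a back-of-the-envelope calculation: the function names are hypothetical, and the assumption that every randomized position can hold any of the 20 proteinogenic amino acids is ours rather than a statement from the cited work.

```python
def library_size(randomized_positions: int, alphabet_size: int = 20) -> int:
    """Theoretical number of distinct sequences when each randomized
    position can independently hold any of `alphabet_size` residues."""
    if randomized_positions < 0:
        raise ValueError("randomized_positions must be non-negative")
    return alphabet_size ** randomized_positions


def positions_needed(target_size: int, alphabet_size: int = 20) -> int:
    """Smallest number of randomized positions whose theoretical
    diversity reaches at least `target_size` members."""
    n = 0
    while library_size(n, alphabet_size) < target_size:
        n += 1
    return n


# Assumption for illustration: "billions of members" taken as 10**9.
print(library_size(7))           # theoretical size with 7 randomized positions
print(positions_needed(10**9))   # positions needed to reach 10**9 members
```

Under these assumptions, seven fully randomized positions are already enough to exceed one billion theoretical members; the number of clones actually obtained is limited by transformation efficiency, which this calculation ignores.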

There are several examples where in vivo generated libraries
of cyclic peptides have been used for rapid selection
of biologically active peptides. For example, protein trans-splicing
has been used by several groups for the generation
of libraries of small cyclic peptides (< 8 residues) in bacterial
and mammalian cells [90, 92, 93, 99-101]. It should be
pointed out, however, that the use of protein trans-splicing
requires particular amino acids at the intein-extein junctions
for efficient trans-splicing [93]. These requirements usually
depend on the type of split intein used, and seriously limit
the diversity of the libraries that can be generated when using
small cyclic peptide templates. Our group has recently
demonstrated the expression in E. coli of libraries based on
the cyclic peptide SFTI-1 (a backbone-cyclized Bowman-Birk
trypsin inhibitor) using an intein-mediated backbone
cyclization approach (see above), which allows the biosynthesis
of backbone-cyclized polypeptides without any sequence
requirement limitation [89].

The use of small cyclic peptides, however, which technically
can be considered a single closed loop (or 2 loops in
the case of SFTI-1), could also limit the potency of the peptides
selected, especially when targeting protein-protein interactions
involving large binding surfaces. In these cases the
use of peptide templates such as cyclotides with multiple
variable loops could facilitate the selection of peptides with
higher affinities.

The potential for generating cyclotide libraries was first
explored by our group using the kalata B1 scaffold [69]. In
this work, wild-type kalata B1 and several mutants were
biosynthesized using an intramolecular native chemical ligation
facilitated by a modified protein splicing unit. Six
different linear versions of kalata B1 were generated
and expressed in E. coli as fusions to a modified version
of the yeast vacuolar membrane ATPase (VMA) intein. The
results demonstrated in vitro folding and cyclization of kalata
B1 to varying degrees depending on which of the six native
cysteine residues was at the N-terminus after cleavage of the
initiating methionine by endogenous MAP. Cleavage and
efficient cyclization of the different linear precursors did not
occur equally, suggesting that the amino acid residues near the
intein, as well as the predisposition of the corresponding
linear precursor to adopt a native fold, may determine the
efficiency of the cleavage/cyclization step [69]. This information
was used to express a small library based on the kalata
B1 scaffold. This library was cyclized in vitro by incubation
with a redox buffer containing reduced glutathione (GSH) as
a thiol co-factor, thus mimicking intracellular conditions,
where GSH is the most abundant thiol co-factor. The use of
GSH allows cyclization and folding to happen in one step
[69, 89]. Analysis of the cyclization/folding reaction by
HPLC and mass spectrometry revealed that all the members
of the kalata B1-based library were expressed and processed
with similar yields to give the corresponding natively folded
cyclotides [69].
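
The precursor design described above (one linear version per native cysteine, each exposing that Cys at the N-terminus once the initiating Met is removed) amounts to enumerating circular permutations of the cyclic sequence. The following is a minimal sketch of that enumeration, not the authors' cloning strategy. Because the kalata B1 sequence is not given in this text, the MCoTI-I sequence is used as a stand-in, and the function name is hypothetical.

```python
def cys_first_precursors(cyclic_seq: str) -> list[str]:
    """Return every linear permutation of a backbone-cyclized sequence
    that begins with a Cys, each prefixed with the initiating Met."""
    precursors = []
    for i, residue in enumerate(cyclic_seq):
        if residue == "C":
            # Open the ring just before this Cys so it becomes residue 1
            # of the linear precursor (after the initiating Met).
            linear = cyclic_seq[i:] + cyclic_seq[:i]
            precursors.append("M" + linear)
    return precursors


# Stand-in sequence (assumption): MCoTI-I, one-letter code.
MCOTI_I = "VCPKILQRCRRDSDCPGACICRGNGYCGSGSDGG"

precursors = cys_first_precursors(MCOTI_I)
for p in precursors:
    print(p)
```

Each string corresponds to one Met-Cys-... open reading frame that would be fused to the intein at its C-terminus; which of these permutations cyclizes efficiently has to be determined experimentally, as described in the text.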

More recently, we have also reported the biosynthesis of
a genetically encoded library of MCoTI-I-based cyclotides in
E. coli cells [26]. The cyclization/folding of the library was
performed either in vitro, by incubation with a redox buffer
containing glutathione, or by in vivo self-processing of the
corresponding precursor proteins. The bacterial gyrase A
intein from Mycobacterium xenopi was used in this work
[26]. This intein typically expresses at higher yields than the
yeast VMA intein in E. coli expression systems [26]. The
peptide libraries were purified and screened for activity using
trypsin-immobilized sepharose beads, and then analyzed
by HPLC and mass spectrometry. Out of 27 mutations studied,
only two, G27P and I22G, negatively affected
the folding of the resulting cyclotides. All of the remaining
cyclotides were able to fold with similar yields. The K6A
mutant, as expected, was not able to bind trypsin. This residue
is key for binding to the specificity pocket of trypsin,
and can only be replaced by positively charged residues
[102]. This mutant was found by NMR to adopt a native cyclotide
structure, confirming that the lack of biological activity
was due to the mutation and not to an inability to adopt a
native fold. It is interesting to note that by modifying the
nature of this residue, the specificity of the corresponding
MCoTI-cyclotide can be changed to target other proteases
[95, 96]. The rest of the MCoTI-based library members were
able to bind trypsin, suggesting they were able to adopt a
native cyclotide fold and retained biological activity. The
affinity of each peptide was analyzed using a competitive
trypsin-binding assay. The mutants had a wide range of affinities,
some being greater than that of wild-type MCoTI-I. The
mutations that lowered affinity were mostly located in loop 1 and the
C-terminal region of loop 6, both well conserved among
other squash trypsin inhibitors. Overall, these data describe
the structural requirements for correct formation of MCoTI-I
and the residues that are key to modulating trypsin binding.
To our knowledge, this is the first time that the biosynthesis
of a genetically-encoded library of MCoTI-based cyclotides
containing a complete suite of amino acid mutants has been
reported.

The chemical synthesis of a complete suite of Ala mutants
for kalata B1 has also been recently reported [25]. In
this work all the mutants were fully characterized structurally
and functionally. The results indicated that only two of
the mutations explored (W23A and P24A, both located in
loop 5, see Fig. 1) prevented folding [25]. The mutagenesis
results obtained in our work with the cyclotide MCoTI-I
are similar, highlighting the extreme robustness of
the cyclotide scaffold to mutations. These studies show that
cyclotides may provide an ideal scaffold for the biosynthesis
of large combinatorial libraries inside living bacterial cells.
These genetically-encoded libraries can then be screened in-cell
for biological activity using high-throughput flow cytometry
techniques for the rapid selection of novel biologically
active cyclotides [69, 103, 104].

SUMMARY  AND  CONCLUDING  REMARKS 

In summary, cyclotides are a novel family of structurally
related globular microproteins with a unique head-to-tail
cyclized backbone, which is stabilized by three disulfide
bonds [27, 105]. The number and positions of cysteine residues
are conserved throughout the family, forming what is
called the cyclic cystine-knot (CCK) motif [35], which acts as a
highly stable and versatile scaffold on which 5 hyper-variable
loops are arranged (Fig. 1). This CCK framework
gives the cyclotides exceptional resistance to thermal and
chemical denaturation, and to enzymatic degradation. This is
particularly important for the development of peptide-based
therapeutics with oral bioavailability. In fact, the use of
cyclotide-containing plants in indigenous medicine first highlighted
the fact that the peptides are resistant to boiling and
are apparently orally bioavailable [14, 32, 33]. Some cyclotides
have also been shown to cross the cell membrane
[21], thus allowing intracellular protein interactions to be targeted,
such as that mediated by viral FLIPs to prevent cell death in
virally infected cells (unpublished results). Cyclotides are
also medium-sized polypeptides and can therefore be readily
synthesized by standard solid-phase peptide synthesis using
either Boc- [54] or Fmoc-based [55] methodologies, thus
allowing the introduction of non-natural amino acids or other
chemical modifications for lead optimization. They can also
be encoded within standard cloning vectors and readily expressed
in bacteria or animal cells [26, 70], thus making
them ideal substrates for molecular evolution strategies to
enable generation and selection of compounds with optimal
binding and inhibitory characteristics using high-throughput
cell-based assays [106]. All of these characteristics make
cyclotides very promising leads or frameworks for the
development of peptide-based therapeutics and diagnostics
[27, 68, 105, 107].

ABBREVIATIONS

AEP      =  asparaginyl endoproteinase
Boc      =  tert-butyloxycarbonyl
EETI-II  =  Ecballium elaterium trypsin inhibitor II
FMDV     =  foot-and-mouth-disease virus
FLIP     =  FLICE-inhibitory protein
Fmoc     =  9-fluorenylmethyloxycarbonyl
GSH      =  reduced glutathione
HIV      =  human immunodeficiency virus
HLE      =  human leukocyte elastase
HPLC     =  high performance liquid chromatography
MAP      =  methionyl aminopeptidase
MCoTI    =  Momordica cochinchinensis trypsin inhibitor
MCV      =  molluscum contagiosum virus
NMR      =  nuclear magnetic resonance
NTR      =  N-terminal repeat
SFTI     =  sunflower trypsin inhibitor
SPPS     =  solid-phase peptide synthesis
VMA      =  vacuolar membrane ATPase

Backbone Dynamics of Cyclotide MCoTI-I Free and Complexed with Trypsin**

Shadakshara  S.  Puttamadappa,  Krishnappa  Jagadish,  Alexander  Shekhtman,  and 
Julio  A.  Camarero* 


Cyclotides are a new emerging family of large plant-derived
backbone-cyclized polypeptides (about 28-37 amino acids
long) that share a disulfide-stabilized core (three disulfide
bonds) characterized by an unusual knotted arrangement.[1]
Cyclotides contrast with other circular polypeptides in that
they have a well-defined three-dimensional structure, and
despite their small size can be considered as microproteins.
Their unique circular backbone topology and knotted
arrangement of three disulfide bonds make them exceptionally
stable to thermal and enzymatic degradation (Scheme 1).


            1   5    10   15   20   25   30
MCoTI-I     VCPKILQRCRRDSDCPGACICRGNGYCGSGSDGG
MCoTI-II    VCPKILKKCRRDSDCPGACICRGNGYCGSGSDGG

(Loops labeled in the scheme: loop 1, loop 2, loop 3, loop 5, loop 6.)

Scheme 1. Primary structure and disulfide connectivities of MCoTI
cyclotides. Dark gray and light gray connectors represent peptide and
disulfide bonds, respectively.


Furthermore, their well-defined structures have been associated
with a wide range of biological functions.[2,3] Cyclotides
MCoTI-I/II are powerful trypsin inhibitors (Ki ≈ 20-30 pM)
that have been recently isolated from the dormant seeds of
Momordica cochinchinensis, a plant member of the Cucurbitaceae
family.[4] Although MCoTI cyclotides do not share
significant sequence homology with other cyclotides beyond
the presence of the three cystine bridges, structural analysis
by NMR spectroscopy has shown that they adopt a similar
backbone-cyclic cystine-knot topology.[5,6] MCoTI cyclotides,
however, show high sequence homology with related cystine-knot
squash trypsin inhibitors,[4] and therefore represent
interesting molecular scaffolds for drug design.[7-10]

Determination of the backbone dynamics of these fascinating
microproteins is key for understanding their physical
and biological properties. Internal motions of a protein on
different timescales, extending from picoseconds to a second,
have been suggested to play an important role in its biological
function.[11] A better understanding of the backbone dynamics
of the cyclotide scaffold will be extremely helpful for
evaluating its utility as a scaffold for peptide-based drug
discovery. Such insight will help in the design of optimal
focused libraries that can be used for the discovery of new
cyclotide sequences with novel biological activities.[12,13]

Herein, we report for the first time the determination of
the internal dynamics of the cyclotide MCoTI-I in the free
state and complexed with trypsin. Uniformly 15N-labeled
natively folded cyclotide MCoTI-I was recombinantly produced
in Escherichia coli growing in minimal M9 medium
containing 15NH4Cl as the only source of nitrogen. Concomitant
backbone cyclization and folding were accomplished by
using intramolecular native chemical ligation[14,15] in combination
with a modified protein splicing unit (Figure S1,
Supporting Information).[16-18] The internal dynamics of
cyclotide MCoTI-I was obtained from 15N spin-lattice and
spin-spin relaxation times and 15N{1H} heteronuclear Overhauser
effect (NOE) enhancements.[11] The backbone flexibility
was characterized by the square of the generalized order
parameter, S², which reveals the dynamics of backbone NH
groups on the pico- to nanosecond timescale.[19,20] The order
parameter satisfies the inequality 0 ≤ S² ≤ 1, in which lower
values indicate larger amplitudes of intramolecular motions.
Motions on the milli- to microsecond timescale were assessed
by the presence of chemical exchange terms in the spin-spin
relaxation.
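
For orientation, the way S² and a chemical exchange term usually enter the analysis can be written out explicitly. The expressions below are the standard textbook model-free form for isotropic tumbling, given here as an assumption; the text does not state which variant of the model was fitted. The symbols τ_m (overall rotational correlation time), τ_e (effective correlation time of internal motion), R_2^0 (exchange-free transverse rate) and R_ex (exchange contribution) are introduced here only for illustration.

```latex
% Model-free spectral density (isotropic tumbling assumed)
\[
J(\omega) = \frac{2}{5}\left[
  \frac{S^{2}\,\tau_m}{1+\omega^{2}\tau_m^{2}}
  + \frac{(1-S^{2})\,\tau}{1+\omega^{2}\tau^{2}}
\right],
\qquad
\frac{1}{\tau} = \frac{1}{\tau_m} + \frac{1}{\tau_e}
\]

% Slow (micro- to millisecond) motions appear as an additive exchange
% contribution to the transverse (spin-spin) relaxation rate
\[
R_2 = R_2^{0} + R_{ex}
\]
```

In this picture S² = 1 corresponds to an N-H vector that is rigid in the molecular frame and S² = 0 to unrestricted internal motion, which matches the interpretation of low S² values given above.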

The NMR spectrum and S² values, derived from the 15N
relaxation data of free MCoTI-I, are shown in Figure 1 a and
d, respectively. Residues Ile5 and Gly23 of free MCoTI-I were
excluded from the backbone dynamics analysis since the
relaxation data could not be fitted to a monoexponential
function, possibly as a result of chemical exchange.[21] Gln7
was not assigned because of broadening of the NMR signal,
presumably caused by fast exchange with water. The S² values
for free MCoTI-I show that most of the NH groups of the
cyclotide backbone are highly constrained with S² values
> 0.8, thus resembling those found in well-folded globular
proteins (Table 1). The average S² value, ⟨S²⟩, for free
MCoTI-I was 0.83 ± 0.03. This value is similar to that found

Figure 1. NMR analysis of the backbone dynamics of free and trypsin-bound MCoTI-I. a) {15N,1H} NMR
heteronuclear single quantum correlation (HSQC) spectrum of free MCoTI-I. Chemical shift assignments
of the backbone amides are indicated. b) Overlay of the {15N,1H} HSQC spectra of free (black) and
trypsin-bound MCoTI-I (red). Residues with large average amide chemical shift differences between the two
different states (>0.3 ppm) are indicated. Peaks that are broadened in trypsin-bound MCoTI-I are
indicated by gray circles. c) Average amide chemical shift difference for all the assigned residues in free
and trypsin-bound MCoTI-I. The chemical shift difference was calculated as:
$\Delta\Omega = [(\Delta\Omega_{NH}^{2} + 0.04\,\Delta\Omega_{N}^{2})/2]^{1/2}$,
where $\Delta\Omega_{NH}$ and $\Delta\Omega_{N}$ are the changes in the amide proton and nitrogen chemical shifts (ppm),
respectively. d) Order parameter, S², for free (black) and trypsin-bound MCoTI-I (red). The S² value is a
measure of backbone flexibility and represents the degree of angular restriction of the N-H vector in the
molecular frame. The MCoTI-I loops are shown at the top of (c) and (d). Small unassigned peaks in the
spectra of both free and trypsin-bound MCoTI-I are from a minor conformation of the protein, and result
from a known isomerization of the backbone at an Asp-Gly sequence in loop 6 of MCoTI-I.
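
The average amide chemical shift difference defined in the caption of Figure 1 c, together with the 0.3 ppm cutoff used to flag large changes, can be sketched in a few lines. The function name and the example shift changes are hypothetical and are not data from this work; only the weighting factor 0.04 and the 0.3 ppm threshold are taken from the caption.

```python
import math


def amide_shift_difference(delta_h_ppm: float, delta_n_ppm: float,
                           n_weight: float = 0.04) -> float:
    """Average amide chemical shift difference (ppm), computed as
    sqrt((dH**2 + n_weight * dN**2) / 2) from the changes in the
    amide proton (dH) and nitrogen (dN) chemical shifts."""
    return math.sqrt((delta_h_ppm ** 2 + n_weight * delta_n_ppm ** 2) / 2.0)


# Hypothetical (dH, dN) changes in ppm for two residues; illustrative only.
example = {"residue_A": (0.40, 1.50), "residue_B": (0.05, 0.20)}

# Flag residues whose combined shift change exceeds the 0.3 ppm cutoff.
large = [name for name, (dh, dn) in example.items()
         if amide_shift_difference(dh, dn) > 0.3]
print(large)
```

The factor 0.04 down-weights the nitrogen dimension, whose chemical shift range is much wider than that of the amide protons, so that neither nucleus dominates the combined value.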


Table 1: Average order parameters of structural elements in MCoTI-I in
the free state and bound to trypsin.

Structural element   Sequence            ⟨S²⟩[a]          ⟨S²⟩[b]
                                         Free MCoTI-I     Trypsin-MCoTI-I
loop 1               3-8                 0.81 ± 0.01      0.49 ± 0.05
loop 2               10-14               0.81 ± 0.01      0.62 ± 0.07
loop 3               16-18               0.84 ± 0.02      0.48[c]
loop 4               20                  0.88[c]          0.76[c]
loop 5               22-26               0.92 ± 0.02      0.61 ± 0.01
loop 6               28-34               0.76 ± 0.05      0.61 ± 0.05
cystine knot         2,10,15,19,21,27    0.84 ± 0.02      0.60 ± 0.08

[a] S² values for residues 5 and 23 from free MCoTI-I are not included in
the average because the relaxation data could not be fitted to a
monoexponential function. [b] S² values for residues 2, 5, 8, 18, 19, 23,
29, 31, 32, and 33 from trypsin-bound MCoTI-I are not included in the
average because of the lack of signal intensity or because the relaxation
data could not be fitted to a monoexponential function. [c] ⟨S²⟩
contains the S² value for a single residue.
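
The exclusion criterion used in the footnotes (relaxation data that cannot be fitted to a monoexponential function) can be illustrated with a small fitting routine. This is a simplified sketch under stated assumptions and not the procedure used in this work, which is not described here: it fits I(t) = I0·exp(−R·t) by linear least squares on ln(I), uses an unweighted fit, and reports r² as a crude goodness-of-fit indicator. The function name and the synthetic decay are hypothetical.

```python
import math


def fit_monoexponential(times, intensities):
    """Least-squares fit of I(t) = I0 * exp(-R * t) via a straight-line
    fit of ln(I) against t. Returns (I0, R, r_squared)."""
    if len(times) != len(intensities) or len(times) < 3:
        raise ValueError("need at least three (t, I) pairs of equal length")
    if any(i <= 0 for i in intensities):
        raise ValueError("intensities must be positive for the log transform")
    y = [math.log(i) for i in intensities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(times, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Residual and total sums of squares in log space
    ss_res = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, y))
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return math.exp(intercept), -slope, r_squared


# Synthetic, noise-free decay (illustrative values, not data from this work)
delays = [0.01, 0.05, 0.10, 0.20, 0.40, 0.80]     # relaxation delays, s
signal = [100.0 * math.exp(-2.5 * t) for t in delays]
i0, rate, r2 = fit_monoexponential(delays, signal)
print(i0, rate, r2)
```

A residue whose decay deviates systematically from a single exponential would give a poor fit in such a scheme; in practice a nonlinear fit with experimental uncertainties would normally be preferred over the log-linear shortcut shown here.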


for the six cystine residues involved in the cystine knot
(⟨S²⟩ = 0.84 ± 0.02) and is considerably larger than those
found for other linear squash trypsin inhibitors (⟨S²⟩ = 0.71
for trypsin inhibitor from Cucurbita maxima (CMTI-III,
78% homology with MCoTI-I)),[22] thus indicating the
importance of the backbone cyclization in rigidifying the
overall structure. Loops 2 through 5 in free MCoTI-I
showed ⟨S²⟩ values > 0.8. In particular, loop 5 showed
an ⟨S²⟩ value of 0.92 ± 0.02, well above the average for the
molecule and the cystine knot. In contrast, loops 1 and 6
showed ⟨S²⟩ values below the average for the molecule.
Thus, loop 6, which is believed to act as a very flexible linker to
allow cyclization,[23] had an ⟨S²⟩ value of 0.76 ± 0.17
with only two residues, Asp32 and Gly33, having values below
0.6. Despite this small ⟨S²⟩ value, residues in loop 6 did not
require significant chemical exchange terms (Figure S2 and
Table S1, Supporting Information), which suggests that the
mobility observed arises mostly from local vibrations.

The ⟨S²⟩ value for loop 1, which is responsible for
binding trypsin, was 0.81 ± 0.07. This value is ≈90% of
the average value for free MCoTI-I. Residue Leu6 in
loop 1 also required chemical exchange terms to be considered,
thus indicating the existence of intramolecular conformational
exchange on the micro- to millisecond timescale.
The mobility observed in loop 1 at both nano- to picosecond
and millisecond timescales has also been described in other
trypsin inhibitors,[22,24,25] and it has been suggested to play an
important role in receptor-ligand binding.[11]

To  explore  whether  that  was  the  case  in  the  MCoTI 
cyclotides,  we  next  studied  the  effect  of  ligand  binding  on  the 
backbone  dynamics  of  MCoTI-I  (Figures  1  and  2).  To  exclude 
the  possibility  that  trypsin  could  cleave  or  scramble  the 
disulfide  bonds  of  MCoTI-I  upon  complex  formation,  we  used 
a  competition  experiment  of  trypsin-[15N]MCoTI-I  with 
unlabeled  MCoTI-I.  The  results  indicated  that  the  structure 
of  MCoTI-I  is  unaltered  upon  trypsin  binding  (Figure  S3, 
Supporting  Information).  Trypsin  binding  led  to  large 
(>  0.3  ppm)  and  specific  changes  in  the  chemical  shifts  of 
Figure 2. Trypsin binding to MCoTI-I affects the MCoTI-I backbone dynamics. a) Ribbon and
b) surface diagrams of the trypsin-MCoTI-I interaction map. Red numbers indicate the
positions of the MCoTI-I loops. The MCoTI-I residues with a large chemical shift difference
(> 0.3 ppm) are in blue. c) Changes in the MCoTI-I order parameter as a result of binding to
trypsin. Residues with Sf2 − Sb2 > 0.2, where Sf2 and Sb2 are the order parameters of the free
and trypsin-bound MCoTI-I, respectively, are depicted in red. MCoTI-I residues that were
broadened in {15N,1H} HSQC because of binding to trypsin are shown in green. The structure of
free MCoTI-II (PDB code: 1IB9)[6] was used to illustrate the changes of MCoTI-I dynamics
arising from trypsin binding.


the residues located in loop 1 (Cys2, Lys4, Ile5, Arg8), loop 3
(Cys15 and Ala18), and loop 6 (Val1) (Figures 1c and 2b and
Table S2, Supporting Information). NMR signals of Cys2,
Ile5, Cys19, Ser29, Ser31, Asp32, and Gly33 were significantly
broadened, presumably because of intramolecular chemical
exchanges in the trypsin-MCoTI-I complex. Arg8, Ala18, and
Gly23 were excluded from backbone dynamics analysis
because their peaks were broadened in the {15N,1H} NOE
spectra. Similar findings have already been reported for other
biomolecular interactions.[26]

We used these changes to construct the trypsin-MCoTI-I
interaction surface. The binding surface is contiguous and
spans 46% of the total molecular area of MCoTI-I (Figure 2b).
As expected, the major difference in the backbone
dynamics was observed in the binding loop (Table 1), where
the mobility in the nano- to picosecond timescale was
increased in MCoTI once bound to trypsin. Loop 1 showed
<S2> = 0.49 ± 0.02, which is much lower than the value for
the rest of the molecule (<S2> = 0.65 ± 0.07). Several
residues in loop 2 (Cys9, Arg10, Ser13, and Asp14), loop 3
(Gly17), and loop 5 (Cys21 and Arg22) also showed significantly
lower values of S2 upon complex formation (Figures 1d and 2c).
The increase in mobility observed in these loops may help to
accommodate the increased flexibility of the binding loop
(Figure 2c).

Since our data clearly show that backbone flexibility of cyclotide
MCoTI-I increases significantly upon binding to trypsin, we decided to estimate
the contribution of these motions to the overall Gibbs free energy of binding
(ΔG). The energetic benefit of this increase in backbone flexibility can be
estimated from the experimental relaxation data, by using the experimentally
measured order parameters S2.[27] The estimated ΔG value was approximately
−62 kJ mol⁻¹ at 298 K. This value is almost identical to the calculated value
from the trypsin inhibitory constant of MCoTI-I (Ki ≈ 20 pM,[28] ΔG ≈
−61 kJ mol⁻¹). The calculated entropic contribution (−TΔS) at the same
temperature was approximately −46 kJ mol⁻¹. These results highlight the importance
of the backbone entropic term to the formation of the trypsin-MCoTI-I complex,
although a more detailed thermodynamic analysis that also includes the
side-chain motions may be required.
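
The free energy quoted from the inhibitory constant can be checked with a few lines of arithmetic. The sketch below assumes the standard relation ΔG = RT ln Ki with a 1 M standard state; the function name and the rounding are illustrative choices, not taken from the paper.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def binding_free_energy_kj(k_i_molar: float, temperature_k: float = 298.0) -> float:
    """Return dG = R*T*ln(Ki) in kJ/mol, assuming a 1 M standard state."""
    return R_GAS * temperature_k * math.log(k_i_molar) / 1000.0


# Ki of about 20 pM at 298 K, as quoted in the text.
dg = binding_free_energy_kj(20e-12)
print(f"dG = {dg:.1f} kJ/mol")
```

With Ki = 20 pM this gives a value close to −61 kJ mol⁻¹, in line with the figure quoted in the paragraph above.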

In  summary,  we  have  reported  the 
backbone  dynamics  of  the  cyclotide 
MCoTI-I  in  the  free  state  and  complexed 
to  its  binding  partner  trypsin  in  solution. 
To  our  knowledge  this  is  the  first  time  the 
backbone  dynamics  of  a  natively  folded 
cyclotide  has  been  reported.  This  has 
been  possible  because  of  the  use  of  modified  protein  splicing 
units  for  the  heterologous  expression  of  folded  cyclotides 
using bacterial expression systems[17,18,29] to incorporate
NMR-active  nuclei  such  as  15N.  Our  results  on  the  backbone 
dynamics  of  free  cyclotide  MCoTI-I  confirm  that  MCoTI-I 
adopts  a  well-folded  and  highly  compact  structure  with  an 
<S2>  value  of  0.83.  This  value  is  similar  to  those  found  in  the 
regions  of  well-folded  proteins  with  restricted  backbone 
dynamics. 

The  results  also  indicate  that  the  trypsin-binding  loop 
(loop  1)  has  a  smaller  S2  value  than  the  average  value  for  the 
whole  molecule,  thus  indicating  a  higher  mobility  of  this 
region  in  the  pico-  to  nanosecond  timescale.  This  region  also 
showed  significant  conformational  exchange  motions  in  the 
micro-  to  millisecond  timescale.  Loop  6  also  possesses  a 
higher  mobility  in  the  pico-  to  nanosecond  timescale  than  the 
averaged value for MCoTI-I, although no significant conformational
exchange motions were detected in the micro- to
millisecond  timescale.  This  result  is  intriguing  since  this  loop 
contains  a  potentially  flexible  Gly-Ser-rich  sequence  that  is 
mostly  absent  among  other  linear  trypsin  squash  inhibitors, 




©  2010  Wiley-VCH  Verlag  GmbH  &  Co.  KGaA,  Weinheim 


Angew.  Chem.  Int.  Ed.  2010,  49,  7030-7034 




and therefore it was thought to be a highly flexible linker to
allow cyclization. More surprising, however, was the fact that
the backbone of MCoTI-I, and especially loop 1, increased
the pico- to nanosecond mobility when bound to trypsin. This
interesting result has already been observed in other high-affinity
protein-protein interactions.[30,31]

The thermodynamic analysis of the backbone contribution
to the formation of the trypsin-MCoTI-I complex by
using measured S2 values also revealed the importance of the
backbone entropic term in the formation of the complex.
Similar findings have also been found in other protease
inhibitors.[32] This increment in backbone mobility may help
to minimize the entropic penalties required for binding.
Hence, we also observed in the HSQC spectrum of the
trypsin-MCoTI-I complex the appearance of a signal corresponding
to the ε-NH3+ of Lys4 located in loop 1, which
suggests that the ammonium group is protected and more
rigid when forming the complex (Figure 3). This is the only


Figure 3. ε-NH3+ of Lys4 is protected from fast exchange with the
solvent in trypsin-bound MCoTI-I. {15N,1H}-HSQC spectra of free (a)
and trypsin-bound MCoTI-I (b) were collected at room temperature
with the 15N-carrier position at 82 ppm and 15N radio-frequency field
strengths of 5.2 kHz for 90° and 180° pulses and 1.2 kHz for
composite decoupling during acquisition.

Lys residue present in the sequence of MCoTI-I (Scheme 1)
and therefore it can be unambiguously assigned. This residue
is key for binding to trypsin[29] and is responsible for binding
to the specificity pocket of trypsin. This cross-peak was totally
absent in the free MCoTI-I sample, which indicates that the
ε-NH3+ of Lys4 is less rigid and rapidly exchanging with solvent
(Figure 3a).

Similarly, the broadening of aliphatic resonances for Arg
side chains with essentially rigid guanidinium groups (that is,
Nε-H bond vectors) has also been described for protein-peptide
complexes.[26] Palmer and co-workers have recently
suggested that this dynamic decoupling of the side-chain
terminus from the rest of the aliphatic part of the side
chain may be a general biophysical strategy for maximizing
residual side-chain and potentially backbone conformational
entropy in proteins and their complexes,[33] which is in
agreement with our observations regarding the increase in
MCoTI-I backbone mobility upon complex formation.

We have also mapped the binding surface of MCoTI-I
once bound to trypsin. Major changes in chemical shifts were
observed for the solvent-exposed residues located in loops 1
(Lys4, Ile5, Arg8), 3 (Ala18), and 6 (Val1) (Figures 1c and
2b). In agreement with these results, we have recently shown
that the introduction of nonconservative mutations in these
positions has a negative effect on the affinity for trypsin,[29]
thus indicating that they may be in close contact with the
protease at the binding interface of the molecular complex.

Cyclotides present several characteristics that make them
appear as promising leads or frameworks for peptide drug
design.[7,8] Investigation of the backbone dynamics is crucial
for a better understanding of the dynamic structural properties
of the cyclotide scaffold and how it affects the mode of
binding of these interesting molecules. The reported data will
help in the design of cyclotide-based libraries for molecular
screening and the selection of de novo sequences with new
biological activities, or the development of grafted analogues
for use as peptide-based drugs.[9,10]
General materials and methods

Analytical HPLC was performed on a HP1100 series instrument with 220 nm and 280 nm
detection using a Vydac C18 column (5 µm, 4.6 × 150 mm) at a flow rate of 1 mL/min.
Semipreparative HPLC was performed on a Waters Delta Prep system fitted with a Waters 2487
Ultraviolet-Visible (UV-vis) detector using a Vydac C18 column (15-20 µm, 10 × 250 mm) at a
flow rate of 5 mL/min. All runs used linear gradients of 0.1% aqueous trifluoroacetic acid (TFA,
solvent A) vs. 0.1% TFA, 90% acetonitrile in H2O (solvent B). UV-vis spectroscopy was carried
out on an Agilent 8453 diode array spectrophotometer. Electrospray mass spectrometry was
performed on an Applied Biosystems API 3000 triple quadrupole mass spectrometer.

Calculated masses were obtained by using ProMac v1.5.3. Protein samples were analyzed by
SDS-PAGE on Invitrogen (Carlsbad, CA) 4-20% Tris-Glycine Gels. The gels were then stained
with Pierce (Rockford, IL) Gelcode Blue, photographed/digitized using a Kodak (Rochester, NY)
EDAS 290, and quantified using NIH Image-J software (http://rsb.info.nih.gov/ij/). DNA
sequencing was performed by the DNA Sequencing and Genetic Analysis Core Facility at the
University of Southern California using an ABI 3730 DNA sequencer, and the sequence data
were analyzed with DNAStar (Madison, WI) Lasergene v8.0.2. All chemicals were obtained from
Sigma-Aldrich (Milwaukee, WI) unless otherwise indicated.

Construction of MCoTI-I expression plasmid

The plasmid expressing the MCoTI-I linear precursor was constructed using the pTXB1 expression
plasmid (New England Biolabs), which contains an engineered Mxe gyrase intein
and a chitin-binding domain (CBD), as previously described [1].

Expression and purification of recombinant proteins

MCoTI-I cyclotide was produced and characterized as previously described [1]. Briefly,
BL21(DE3) cells (Novagen, San Diego, CA) were transformed with the MCoTI-I plasmid (see
above). Expression was carried out in M9 minimal medium (6 × 1 L) containing 0.1% 15NH4Cl
and 100 mg/ml ampicillin as previously described [1]. Briefly, 5 mL of an overnight starter culture
derived from a single clone was used to inoculate 1 L of M9 medium. Cells were grown to an OD
at 600 nm of ≈ 0.5 at 37 °C, and expression was induced by adding
isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.3 mM at 30 °C for 4 h.
The cells were then harvested by centrifugation. For fusion protein purification, the cells were
resuspended in 30 mL of lysis buffer (0.1 mM EDTA, 1 mM PMSF, 50 mM sodium phosphate,
250 mM NaCl buffer at pH 7.2 containing 5% glycerol) and lysed by sonication. The lysate was
clarified by centrifugation at 15,000 rpm in a Sorvall SS-34 rotor for 30 min. The clarified
supernatant was incubated with chitin beads (1 mL beads/L cells, New England Biolabs),
previously equilibrated with column buffer (0.1 mM EDTA, 50 mM sodium phosphate, 250 mM
NaCl buffer at pH 7.2), at 4 °C for 1 h with gentle rocking. The beads were extensively washed
with 50 bead-volumes of column buffer containing 0.1% Triton X-100 and then rinsed and
equilibrated with 50 bead-volumes of column buffer.

Concomitant cleavage, cyclization and folding of [15N]-MCoTI-I cyclotide with reduced
glutathione (GSH)

Purified [15N]-MCoTI-Intein-CBD fusion protein was cleaved with 50 mM GSH in degassed
column buffer as previously described [1]. The cyclization/folding reactions were kept for up to 2
days at 25 °C with gentle rocking. The supernatant of the cyclization reaction was separated by
filtration and the beads were washed with additional column buffer (1 column volume per
mL of beads). The supernatant and washes were pooled, and the oxidized cyclotide was
purified by semipreparative HPLC using a linear gradient of 15-45% solvent B over 30 min,
yielding ≈1.8 mg of [15N]-MCoTI-I (≈0.3 mg/L). Purified product was characterized by HPLC
and ES-MS (Fig. S1).

Purification of MCoTI-I using trypsin-sepharose beads

Preparation of trypsin-sepharose beads: NHS-activated Sepharose was washed with 15
volumes of ice-cold 1 mM HCl. Each volume of beads was incubated with an equal volume of
coupling buffer (50 mM NaCl, 200 mM sodium phosphate buffer at pH 6.0) containing 2 mg of
porcine pancreas trypsin type IX-S (14,000 units/mg) for 3 h with gentle rocking at room
temperature. The beads were then rinsed with 10 volumes of coupling buffer, and incubated
with excess coupling buffer containing 100 mM ethanolamine (Eastman Kodak) for 3 h with
gentle rocking at room temperature. Finally, the beads were washed with 50 volumes of wash
buffer (200 mM sodium acetate buffer at pH 3, 250 mM NaCl) and stored in one volume of wash
buffer. ≈30 mL of clarified lysates (in vivo obtained MCoTI-I) or 10 mL of GSH-induced
cyclization/folding reaction mixture (in vitro obtained MCoTI-I) were typically incubated with
1.0-2.0 mL of trypsin-sepharose for one hour at room temperature with gentle rocking, and
centrifuged at 3000 rpm for 1 min. The beads were washed with 50 volumes of PBS containing
0.1% Triton X-100, then rinsed with 50 volumes of PBS, and drained of excess PBS. Bound
peptides were eluted with 2.0 mL of 8 M GdmHCl at pH = 4.0 and fractions were desalted and
analyzed by RP-HPLC, ES-MS/MS and NMR (Fig. S1).

NMR  Spectroscopy 

NMR samples of free MCoTI-I were prepared by dissolving [15N]-MCoTI-I in 10 mM potassium
phosphate buffer in 90% H2O/10% 2H2O (v/v) at pH = 7.0 to a concentration of 0.2 mM. The
trypsin-MCoTI-I complex sample was prepared by titrating trypsin (porcine pancreas trypsin type
IX-S, 14,000 units/mg) into a 0.2 mM [15N]-MCoTI-I solution. Complex formation was monitored
by NMR spectroscopy. No changes in the NMR spectrum were seen after the molar ratio
between trypsin and MCoTI-I reached 1:1. The protease inhibitor 4-(2-aminoethyl)
benzenesulfonyl fluoride hydrochloride (AEBSF) was added to inhibit the residual protease
activity of unbound trypsin. Long-term stability of the NMR sample was monitored by the NMR
spectra of trypsin-[15N]-MCoTI-I. No changes in the NMR peak positions of trypsin-[15N]-MCoTI-I
were observed over a period of two months. We used a competition experiment of
trypsin-[15N]-MCoTI-I with unlabeled MCoTI-I to exclude the possibility that trypsin-bound
cyclotide was cleaved or had scrambled disulfide bonds (Figure S3). By adding a tenfold molar
excess of unlabeled MCoTI-I to the trypsin-[15N]-MCoTI-I NMR sample we were able to
reconstitute the NMR spectrum of free [15N]-MCoTI-I (Figure S3C). This experiment showed that
[15N]-MCoTI-I was not modified by complexing with trypsin.

NMR  data  were  acquired  on  Bruker  Avance  II  700  MHz  and  Avance  III  500  MHz  spectrometers 
equipped  with  ultra-sensitive  triple  resonance  cryoprobes  capable  of  applying  pulsed  field 
gradients  along  the  z-axis.  All  experiments  were  conducted  at  25 °C. 

Assignments for the amide nitrogens and protons (Table S3) of free and trypsin-bound MCoTI-I
were obtained by using standard procedures as previously described [1]. Briefly, heteronuclear
3D NMR experiments, 1H{15N}-TOCSY-HSQC and 1H{15N}-NOESY-HSQC, were performed
according to standard procedures [8, 9] with spectral widths of 12 ppm in proton dimensions and
35 ppm in nitrogen dimension. The carrier frequency was centered on the water signal, and the
solvent was suppressed by using the WATERGATE pulse sequence. TOCSY (spin lock time of 80
and 60 ms for free and trypsin-bound MCoTI-I, respectively) and NOESY (mixing time of 150
and 80 ms for free and trypsin-bound MCoTI-I, respectively) spectra were collected using 1024
t3 points, 256 t2 and 128 t1 blocks of 16 transients. Spectra were processed by using Topspin
1.3 (Bruker). Each 3D data set was apodized by a 90°-shifted sinebell-squared function in all
dimensions, and zero filled to 1024 × 512 × 256 points prior to Fourier transformation. Spectra
were analyzed by using the NMR software program CARA [3].

The spin-lattice (R1) and spin-spin (R2) relaxation rates, as well as the steady-state
15N{1H} nuclear Overhauser effect (NOE), were measured at both 500 MHz
and 700 MHz for MCoTI-I in the free and trypsin-bound states. The pulse sequences used to
record the 15N R1, R2 and steady-state 15N{1H}-NOE spectra were as described [2], with a slight
modification to include WATERGATE techniques for eliminating the water resonance. Decoupling
of 15N spins during acquisition was performed using a WALTZ-16 composite pulse sequence
with a field strength of 1.35 kHz. The observed chemical shifts were determined relative to
the internal reference, sodium 2,2-dimethyl-2-silapentane-5-sulfonate (DSS). A recycle delay of
1.5 s was used in the R1 and R2 relaxation measurements for both free and trypsin-bound
samples. The following delays were used to measure the R2 values at 500 MHz for free
MCoTI-I: 20, 30, 50, 60, 80, 100, 140, 300, and 500 ms; at 500 MHz for trypsin-bound MCoTI-I:
10, 20, 30, 40, 60, 80, 100, and 300 ms; at 700 MHz for free MCoTI-I: 20, 30, 50, 80, 100, 150,
300, and 5000 ms; and at 700 MHz for trypsin-bound MCoTI-I: 10, 20, 50, 100, 150, and 300 ms.
For R1 measurements, the following variable relaxation delays were used at 500 MHz for free
MCoTI-I: 20, 60, 100, 200, 400, 500, 700, 1000, and 2000 ms; at 500 MHz for trypsin-bound
MCoTI-I: 20, 370, 570, 770, 1000, and 1500 ms; at 700 MHz for free MCoTI-I: 20, 50, 200, 370,
520, and 850 ms; and at 700 MHz for trypsin-bound MCoTI-I: 20, 50, 100, 250, 500, 770, and
1000 ms. R2 measurements utilized a 900 µs delay between sequential 15N pulses in the
CPMG pulse train for attenuating the 15N signal loss during the R2 relaxation period. The field
strength of the refocusing pulses in the CPMG pulse sequence was 1 kHz.

Heteronuclear steady-state 15N{1H}-NOEs were determined from spectra recorded with (NOE)
and without saturation of protons, where saturation was achieved by a train of 120° pulses
separated by 5 ms for 1 s. The recycle time between the experiments was 5 s. The NOE
measurements were performed using a total of 256 transients per increment in the indirect 15N
dimension. The 15N dimension was zero filled to 256 real data points. Heteronuclear NOEs were
calculated according to the equation $\eta = I_{sat}/I_{unsat}$, in which $I_{sat}$ and
$I_{unsat}$ are the experimental peak intensities measured from spectra recorded with and
without proton saturation. All data were analyzed by using CARA [3]. The R1 and R2 relaxation
rates were determined by fitting the peak intensity $I(t)$ to a single-exponential function given
by $I(t) = I_0 e^{-Rt}$, where t is the NMR time delay, using the MATLAB program RELAXFIT [4].
Hydrodynamic parameters of free and trypsin-bound MCoTI-I were calculated by using RotDif [7].
Residues possessing low NOE values, less than 0.6 for free MCoTI-I and less than 0.2 for
trypsin-bound MCoTI-I, or participating in slow µs-ms exchanges were removed from the
calculations. The anisotropy of rotational diffusion for free and trypsin-bound MCoTI-I was
2.8 ± 0.3 and 1.5 ± 0.2, respectively [7]. The overall correlation times, τc, for free and
trypsin-bound MCoTI-I were 2.4 ± 0.2 ns and 12.52 ± 0.5 ns, respectively [7], which is consistent
with the increased molecular weight of trypsin-bound MCoTI-I. Analysis of the micro-dynamic
motional parameters using the Lipari-Szabo formalism was performed utilizing DYNAMICS [5, 6]
and R1, R2 and 15N{1H}-NOE data at 500 MHz and 700 MHz. Errors (68.3% confidence limits) in
the micro-dynamic parameters were obtained from the analytic inverse covariance matrices of
the fits [4]. This method of error estimation is crucial because it takes into account both the
random errors as well as the model selection errors.
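
The single-exponential fit and the NOE ratio described above can be illustrated with a short sketch. It is not the RELAXFIT procedure: it swaps in a plain linear least-squares fit of ln I against t, which is only equivalent for noise-free data, and the function names and the synthetic intensities are assumptions made for illustration.

```python
import math


def fit_relaxation_rate(delays_s, intensities):
    """Fit I(t) = I0 * exp(-R * t) by linear least squares on ln I.

    Returns (I0, R) with R in s^-1. Assumes all intensities are positive.
    """
    n = len(delays_s)
    log_i = [math.log(v) for v in intensities]
    t_mean = sum(delays_s) / n
    y_mean = sum(log_i) / n
    sxx = sum((t - t_mean) ** 2 for t in delays_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays_s, log_i))
    slope = sxy / sxx                      # slope of ln I versus t equals -R
    intercept = y_mean - slope * t_mean    # intercept equals ln I0
    return math.exp(intercept), -slope


def steady_state_noe(i_sat, i_unsat):
    """Heteronuclear NOE as the ratio of saturated to unsaturated peak intensity."""
    return i_sat / i_unsat


# Synthetic, noise-free intensities (not measured data): I0 = 100, R = 5 s^-1,
# sampled at the R2 delays listed for free MCoTI-I at 500 MHz, converted to seconds.
delays = [0.020, 0.030, 0.050, 0.060, 0.080, 0.100, 0.140, 0.300, 0.500]
intensities = [100.0 * math.exp(-5.0 * t) for t in delays]
i0, rate = fit_relaxation_rate(delays, intensities)
print(f"I0 = {i0:.1f}, R = {rate:.2f} s^-1")
print(f"NOE = {steady_state_noe(0.75, 1.0):.2f}")
```

For real, noisy peak intensities a weighted non-linear fit is generally preferable, since taking logarithms distorts the error distribution at long delays.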
Figure S1. Analytical reversed-phase HPLC trace of GSH-induced cyclization of [15N]-MCoTI-I
precursor before and after being purified by preparative HPLC and by affinity chromatography
using trypsin-agarose beads. Identification of the folded cyclotide was carried out by ES-MS.
Expected molecular weight for the [15N]-MCoTI-I is shown in parentheses. Natively folded
MCoTI-I is indicated with an arrow.
Figure S2. Backbone dynamics of free and trypsin-bound MCoTI-I. (A) Spin-lattice relaxation
rate, R1 (s⁻¹), of the free MCoTI-I at 500 MHz (black) and 700 MHz (red) and the trypsin-bound
MCoTI-I at 500 MHz (green) and 700 MHz (blue). (B) Spin-spin relaxation rate, R2 (s⁻¹), of the
free MCoTI-I at 500 MHz (black) and 700 MHz (red) and the trypsin-bound MCoTI-I at 500 MHz
(green) and 700 MHz (blue). (C) Heteronuclear nuclear Overhauser effect, 15N{1H}-NOE, of the
free MCoTI-I (black) and the trypsin-bound MCoTI-I at 700 MHz. (D) Rex values of the free (red)
and trypsin-bound MCoTI-I from the Lipari-Szabo analysis are shown. Large Rex values
(> 0.1 s⁻¹) suggest that the residue undergoes slow ms-µs motion. The positions of the MCoTI-I
loops are indicated by arabic numbers.
Table S1. Free and trypsin-bound MCoTI-I order parameters and spectral density function
(SDF) models.

Residue #   Free MCoTI-I                      Trypsin-bound MCoTI-I             ΔS2
            S2*    SDF model    χ2            S2*    SDF model    χ2
1, Val      0.68   LS[a]        24.68         0.34   CL[c]        6.23          0.35
2, Cys      0.79   LS-ex[b]     4.92
4, Lys      0.85   LS-ex        6.46          0.23   CL           3.72          0.61
6, Leu      0.70   CL-ex[d]     0.26          0.75   LS           12.45         -0.05
8, Arg      0.88   LS           6.11
9, Cys      0.80   LS           3.53          0.23   CL           7.63          0.57
10, Arg     0.81   LS           6.33          0.51   CL-ex        3.81          0.31
11, Arg     0.87   LS           25.49         0.74   LS           5.27          0.13
12, Asp     0.89   LS           27.85         0.79   LS           2.38          0.10
13, Ser     0.85   LS           23.67         0.43   CL           3.62          0.43
14, Asp     0.89   LS           20.76         0.62   CL-ex        2.44          0.27
15, Cys     0.82   LS           21.65         0.81   LS           1.44          0.01
17, Gly     0.83   LS           6.98          0.48   CL-ex        0.03          0.35
18, Ala     0.85   LS           17.91
19, Cys     0.79   LS           8.24
20, Ile     0.88   LS           3.54          0.77   LS           3.12          0.11
21, Cys     0.98   LS           13.31         0.54   CL           2.53          0.44
22, Arg     0.86   LS           5.12          0.18   CL           16.40         0.68
24, Asn     0.99   LS           10.22         0.73   LS           9.00          0.26
25, Gly     0.92   LS           28.68         0.75   LS           32.64         0.17
26, Tyr     0.93   CL           7.97          0.78   LS           10.90         0.15
27, Cys     0.88   LS-ex        16.51         0.79   LS           29.22         0.09
28, Gly     0.89   LS-ex        3.49          0.75   LS           5.74          0.14
29, Ser     0.91   LS           9.45
30, Gly     0.79   LS           39.23         0.58   CL-ex        1.68          0.21
31, Ser     0.81   LS           7.09
32, Asp     0.56   LS           13.41
33, Gly     0.51   LS           6.85
34, Gly     0.98   LS           4.99          0.77   LS           4.20          0.21

[a] LS corresponds to the original Lipari-Szabo SDF model [5, 10], assuming that the
characteristic correlation time for local fluctuations, τloc, is much shorter than the global
rotational correlation time, τc [10].
[b] LS-ex corresponds to the Lipari-Szabo SDF model supplemented with the conformational
exchange term [10].
[c] CL corresponds to the extended Lipari-Szabo SDF model [11].
[d] CL-ex corresponds to the extended Lipari-Szabo SDF model supplemented with the
conformational exchange term [11].
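
As a rough illustration of the simplest of these models, the sketch below evaluates the original Lipari-Szabo spectral density for isotropic tumbling. This is an assumption-laden simplification: the analysis in the text used anisotropic rotational diffusion and the DYNAMICS program, neither of which is reproduced here, and the function name and example values are illustrative.

```python
import math


def spectral_density_ls(omega, s2, tau_c, tau_loc=0.0):
    """Original Lipari-Szabo spectral density for isotropic tumbling.

    J(w) = (2/5) * [S2*tc/(1 + (w*tc)^2) + (1 - S2)*t/(1 + (w*t)^2)],
    with 1/t = 1/tc + 1/tloc. With tau_loc = 0 the second term vanishes,
    which corresponds to the limit of very fast local motion.
    """
    j = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    if tau_loc > 0.0:
        tau = tau_c * tau_loc / (tau_c + tau_loc)
        j += (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * j


# Example: 15N Larmor frequency of roughly 50.7 MHz (500 MHz 1H spectrometer)
# and the overall correlation time of 2.4 ns reported for free MCoTI-I.
omega_n = 2.0 * math.pi * 50.7e6
for s2 in (0.5, 0.9):
    print(s2, spectral_density_ls(omega_n, s2, 2.4e-9))
```

In this limit J(ω) scales linearly with S2, which is why a lower order parameter translates directly into weaker relaxation contributions from overall tumbling.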


* The order parameter S2 was calculated by minimizing χ2 [5], which includes the relaxation
data sets collected at 500 MHz and 700 MHz (R1, R2 and NOE).

In the expression for χ2, the sum runs over all the residues as well as over the frequencies
(f = 500 MHz and 700 MHz); σR1,i,f, σR2,i,f and σNOE,i,f are the standard deviations in R1,i,f,
R2,i,f and NOEi,f for the ith residue, respectively.

The superscripts refer to the observed (obs) and calculated (cal) [10, 11] values of the
relaxation parameters.
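
The target function itself is not reproduced in the text above. A form consistent with the definitions given would be the usual uncertainty-weighted sum of squared residuals shown below; this is a sketch under that assumption, and the exact expression used in reference [5] may differ.

```latex
\chi^2 = \sum_{i}\sum_{f}\left[
  \left(\frac{R_{1,i,f}^{\mathrm{obs}} - R_{1,i,f}^{\mathrm{cal}}}{\sigma_{R1,i,f}}\right)^{2}
+ \left(\frac{R_{2,i,f}^{\mathrm{obs}} - R_{2,i,f}^{\mathrm{cal}}}{\sigma_{R2,i,f}}\right)^{2}
+ \left(\frac{\mathrm{NOE}_{i,f}^{\mathrm{obs}} - \mathrm{NOE}_{i,f}^{\mathrm{cal}}}{\sigma_{\mathrm{NOE},i,f}}\right)^{2}
\right]
```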


Table S2. Summary of the 1H and 15Nα-1H NMR assignments for the backbone protons (i.e. Nα-H)
of free and trypsin-bound MCoTI-I. The last column is the combined shift change
[(ΔδNH^2 + 0.04 ΔδN^2)/2]^(1/2).

Residue #   δNH (ppm)   δNH (ppm)   ΔδNH (ppm)   δN (ppm)   δN (ppm)   ΔδN (ppm)   Combined
            free        bound                    free       bound                  (ppm)
1, Val      8.34        8.57        -0.23        120.97     124.80     -3.83       0.57
2, Cys      8.54        8.09        0.45         126.33     126.41     -0.08       0.32
4, Lys      8.08        8.45        -0.37        120.51     123.74     -3.23       0.53
5, Ile      7.52        7.86        -0.34        119.94     118.51     1.43        0.31
6, Leu      8.52        8.24        0.28         125.77     126.23     -0.46       0.21
8, Arg      8.62        9.31        -0.69        127.56     127.9      -0.34       0.49
9, Cys      8.35        8.38        -0.03        120.36     121.92     -1.56       0.22
10, Arg     7.97        7.98        -0.01        117.13     117.27     -0.14       0.02
11, Arg     9.36        9.37        -0.01        117.84     117.83     0.01        0.01
12, Asp     9.32        9.24        0.08         120.97     120.31     0.66        0.11
13, Ser     8.28        8.25        0.03         115.76     115.57     0.19        0.03
14, Asp     7.65        7.63        0.02         120.59     120.43     0.16        0.03
15, Cys     7.97        8.43        -0.46        117.74     118.64     -0.90       0.35
17, Gly     8.37        8.45        -0.08        106.61     106.26     0.35        0.08
18, Ala     8.33        7.88        0.45         125.08     126.84     -1.76       0.41
19, Cys     8.06        8.04        0.02         117        116.20     0.80        0.11
20, Ile     8.90        9.09        0.19         113.3      112.8      -0.5        0.01
21, Cys     9.02        9.0         0.00         124.19     124.27     -0.08       0.01
22, Arg     8.02        7.95        0.07         128.56     128.54     0.02        0.04
23, Gly     8.79        8.73        0.06         108.26     107.79     0.47        0.08
24, Asn     7.70        7.88        -0.18        115.74     115.76     -0.02       0.13
25, Gly     8.30        8.29        0.01         107.30     107.23     0.07        0.01
26, Tyr     7.19        7.14        0.05         116.62     116.39     0.23        0.05
27, Cys     8.68        8.48        0.20         120.71     120.41     0.30        0.15
28, Gly     9.72        9.61        0.11         109.64     108.66     0.98        0.16
29, Ser     8.71        8.69        0.02         115.75     116.10     -0.35       0.05
30, Gly     9.06        9.06        0.00         111.77     112.71     -0.94       0.13
31, Ser     8.57        8.72        -0.15        116.00     116.40     -0.40       0.12
32, Asp     8.31        8.31        0.00         122.06     122.05     0.01        0.01
33, Gly     8.08        8.04        0.04         108.41     108.42     -0.01       0.02
34, Gly     8.05        8.02        0.03         110.78     110.84     -0.06       0.02
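
The combined shift change in the last column of Table S2 can be recomputed from the two per-nucleus differences. The sketch below assumes the weighted root-mean-square form with a 0.04 weight on the 15N term, which reproduces the tabulated values for the rows shown; the function name and the three-row sample are illustrative.

```python
import math


def combined_shift_change(d_nh_ppm, d_n_ppm, n_weight=0.04):
    """Combined 1H/15N shift change: sqrt((dNH^2 + n_weight * dN^2) / 2)."""
    return math.sqrt((d_nh_ppm ** 2 + n_weight * d_n_ppm ** 2) / 2.0)


# (residue, free - bound 1H shift, free - bound 15N shift), copied from Table S2.
rows = [
    ("Val1", -0.23, -3.83),
    ("Lys4", -0.37, -3.23),
    ("Arg8", -0.69, -0.34),
]
for name, d_nh, d_n in rows:
    print(name, round(combined_shift_change(d_nh, d_n), 2))
```

Residues whose combined value exceeds 0.3 ppm are the ones highlighted as the interaction surface in the main text.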
ABSTRACT:

Cyclotides are a new emerging family of large plant-derived
backbone-cyclized polypeptides (≈30 amino acids
long) that share a disulfide-stabilized core (three disulfide
bonds) characterized by an unusual knotted structure.

Their  unique  circular  backbone  topology  and  knotted 
arrangement  of  three  disulfide  bonds  make  them 
exceptionally  stable  to  thermal,  chemical,  and  enzymatic 
degradation  compared  to  other  peptides  of  similar  size. 
Currently,  more  than  100  sequences  of  different  cyclotides 
have  been  characterized,  and  the  number  is  expected  to 
increase  dramatically  in  the  coming  years.  Considering 
their  stability  and  biological  activities  like  anti-HIV, 
uterotonic,  and  insecticidal,  and  also  their  abilities  to 
cross  the  cell  membrane,  cyclotides  can  be  exploited  to 
develop  new  stable  peptide-based  drugs.  We  have  recently 
demonstrated  the  intriguing  possibility  of  producing 
libraries  of  cyclotides  inside  living  bacterial  cells.  This 
opens  the  possibility  to  generate  large  genetically  encoded 
libraries  of  cyclotides  that  can  then  be  screened  inside  the 
cell  for  selecting  particular  biological  activities  in  a  high- 
throughput  fashion.  The  present  minireview  reports  the 
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efforts  carried  out  toward  the  selection  of  cy clo tide-based 
compounds  with  specific  biological  activities  for 
drug  design.  ©  2010  Wiley  Periodicals,  Inc.  Biopolymers 
(Pept  Sci)  94:  611-616,  2010. 
INTRODUCTION

Peptides have proven to be valuable and effective drugs when
targeting protein-protein interactions.1,2 The concept of using
peptides to modulate intracellular processes has been investigated for
decades, as peptides play a central role in every cell in the body.
Polypeptides that mimic protein fragments typically provide effective
competitive-binding antagonists for protein-protein interactions.
Targeting these interactions, which usually involve large binding
surfaces, has proven challenging for small molecules. The utility of
peptide therapeutics, however, has typically been limited by their
generally poor stability and limited bioavailability. For example,
linear peptides are inherently unstable within the body and are
rapidly broken down into inactive fragments by proteolytic enzymes;
the fragments are then filtered from the bloodstream by the kidneys
within minutes. In response to this challenge, a number of novel
polypeptide scaffolds are starting to emerge to replace classical
peptide and protein-based therapeutics.2-9

This article reviews the latest developments on the use of
cyclotides as a molecular scaffold for delivering a novel type of
peptide-based therapeutics with the specificity of proteins but the
physical-chemical properties that are usually associated with small
molecule-based drugs.

Cyclotides,  a  Novel  Ultrastable  Polypeptide  Scaffold 

Special attention has recently been given to the use of highly
constrained peptides, also known as micro- or miniproteins, as
extremely stable and versatile scaffolds for the production of
high-affinity ligands for specific protein capture and/or development
of therapeutics.5,8 Cyclotides are fascinating microproteins
(≈30 amino acids long) present in plants from the Violaceae,
Rubiaceae, and Cucurbitaceae families, and they feature various
biological actions such as protease inhibitory, insecticidal,
cytotoxic, anti-HIV, or hormone-like activity.10,11 More recently, it
has been demonstrated that cyclotides also show potent anthelmintic
activity.12,13 Their insecticidal and anthelmintic properties suggest
that they may function as defense molecules in plants.

Cyclotides share a unique head-to-tail circular knotted topology of
three disulfide bridges, with one disulfide penetrating through a
macrocycle formed by the two other disulfides and interconnecting
peptide backbones, forming what is called a cystine knot topology
(see Figure 1). These microproteins can be considered as natural
combinatorial peptide libraries structurally constrained by the
cystine-knot scaffold and head-to-tail cyclization but in which
hypermutation of essentially all residues is permitted with the
exception of the strictly conserved cysteines that comprise the
knot.17,19 The main features of cyclotides are therefore a remarkable
stability due to the cystine knot, a small size making them readily
accessible to chemical synthesis, and an excellent tolerance to
sequence variations. For example, the first cyclotide to be
discovered, kalata B1 (kB1), is an orally effective uterotonic.20
Intriguingly, the MCoTI-II cyclotide has also been shown to cross the
cell membrane through macropinocytosis.21
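
The conserved six-cysteine framework can be made concrete with a small sketch that counts the residues between consecutive Cys residues of a backbone-cyclized sequence, including the segment that wraps around the point where the sequence is written open. The two sequences are those shown for kalata B1 and MCoTI-II in Figure 1 with the alignment gaps removed; the function name and the loop numbering by order of appearance are illustrative assumptions.

```python
from typing import List


def loop_lengths(cyclic_seq: str) -> List[int]:
    """Residues between consecutive Cys in a head-to-tail cyclic sequence.

    The sequence is treated as circular, so the last entry is the loop that
    spans the position where the written sequence starts and ends.
    """
    cys = [i for i, aa in enumerate(cyclic_seq) if aa == "C"]
    n = len(cyclic_seq)
    # The modulo turns the negative wrap-around difference into a loop length.
    return [(cys[(k + 1) % len(cys)] - cys[k] - 1) % n for k in range(len(cys))]


KALATA_B1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"
MCOTI_II = "GGVCPKILKKCRRDSDCPGACICRGNGYCGSGSD"

for name, seq in (("kalata B1", KALATA_B1), ("MCoTI-II", MCOTI_II)):
    print(name, seq.count("C"), "Cys, loop lengths:", loop_lengths(seq))
```

Both sequences contain six Cys residues, while the lengths and compositions of the intervening loops differ, which is the sense in which the scaffold tolerates sequence variation.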

Using  the  Cyclotide  Scaffold  to  Introduce 
Novel  Biological  Activities 

There have been a number of reports showing the plasticity of the
cyclotide framework and its tolerance to substitution. The first proof
of concept for grafting of new sequences onto kB1 was established by
replacing some hydrophobic residues in loop 5 with charged and polar
residues. These residues form part of a surface-exposed hydrophobic
patch that plays a significant role in the folding and biological
activity of kB1. The modified cyclotide analogs retained the native
fold and lacked the undesirable hemolytic activity of the parent
peptide despite the importance of these residues.17

kalata B1   G.LPVCGET...CVGGT.C...NTPGCTC...SWPVCTR..N
kalata B2   G.LFVCGET...CFGGT.C...NTPGCSC...TWPICTR..D
kalata B3   G.LPTCGET...CFGGT.C...NTPGCTCD..PWPICTR..D
MCoTI-I     GG..VCPKILQRCRRDSDC....PGACIC..RGNGYCGSGSD
MCoTI-II    GG..VCPKILKKCRRDSDC....PGACIC..RGNGYCGSGSD

FIGURE 1 Primary and tertiary structure of cyclotides from the plants
Momordica cochinchinensis (MCoTI-II) and Oldenlandia affinis
(kalata B1).14-16 Red and blue connectors indicate backbone and
disulfide bonds, respectively.

Novel VEGF-A antagonists have also been developed by grafting a
peptide sequence able to antagonize VEGF-A onto kB1.22 One of the
grafted analogs showed biological activity at low-micromolar
concentration in an in vitro VEGF-A antagonism assay, and the in vitro
stability of the target epitope was markedly increased using this
approach.

The use of the cyclotide scaffold in drug design has also recently
been shown by engineering non-native activities into the cyclotide
MCoTI-II (see Figure 1).23 Replacing the P1 residue in the active loop
of the cyclotide produced several MCoTI-II analogs with different
specificities toward alternative protease targets. Interestingly,
several analogs showed selective low-µM inhibition of
foot-and-mouth-disease virus 3C protease. These are the first-reported
peptide-based inhibitors of this protease, and although the potency
was relatively low (low µM), this study demonstrates the potential of
using MCoTI-based cyclotides for designing novel protease
inhibitors.23 Hence, the MCoTI-cyclotide appears to be a versatile
scaffold for the display of biological activities, as it has also been
shown to be able to cross cellular membranes.21 This opens the
possibility of using it for delivering biologically active peptide
sequences to intracellular targets. Several grafting studies using the
MCoTI-cyclotide scaffold against intracellular targets involved in
programmed cell death and in tumor cell proliferation and suppression
are currently underway in our laboratory.

MCoTI cyclotides share a high sequence homology with related
cystine-knot trypsin inhibitors found in squash, such as EETI-II
(Ecballium elaterium trypsin inhibitor II), and, in fact, can be
considered cyclized homologs of these protease inhibitors. Squash
cystine-knot trypsin inhibitors have also been successfully used to
graft biological activities. Thus, the RGD sequence, originally
discovered in disintegrins, has been grafted into loop 1 of EETI-II,
yielding an EETI-II analog with platelet inhibitory activity.24
Interestingly, the engineered proteins were much more potent in
inhibiting platelet aggregation than the grafted peptides,
highlighting the importance of grafting a linear epitope into a stable
peptide scaffold. These highly stable peptides could have clinical use
for the treatment of patients with acute coronary syndrome, for
example.

In addition to displaying biological activities, EETI-II peptides
have also been shown to permeate through rat small intestinal mucosa
more effectively than other peptide drugs such as insulin and
bacitracin.25 These analogs also showed stability in plasma comparable
to that of cyclotides, thus supporting the utility of highly
constrained cystine-knot peptides in drug design.

All these data highlight the extraordinary pharmacokinetic
properties of cyclotides and cystine-knot peptides in general, thus
confirming the potential of these polypeptide scaffolds in
peptide-based drug discovery.

Biosynthesis  of  Cyclotides 

Cyclotides are ribosomally produced in plants from precursors that
comprise between one and three cyclotide domains; however, the
mechanism of excision of the cyclotide domains and ligation of the
free N- and C-termini to produce the circular peptides has not yet
been completely elucidated.26,27

Our group has recently developed and successfully used a biomimetic
approach for the biosynthesis of folded cyclotides inside bacterial
cells by making use of modified protein splicing units in combination
with an in-cell intramolecular native chemical ligation (NCL) reaction
(see Figure 2).16,28,29 Intramolecular NCL requires the presence of an
N-terminal Cys residue and a C-terminal α-thioester group in the same
linear precursor molecule.30-32 For this purpose, the linear cyclotide
precursors were fused in frame at their C- and N-terminus to a
modified intein and a Met residue, respectively. This allows the
generation of the required C-terminal α-thioester and N-terminal Cys
residue after in vivo processing by endogenous Met aminopeptidase. Our
group has also recently used this biomimetic approach for the
biosynthesis of another backbone-cyclized peptide, the Bowman-Birk
inhibitor sunflower trypsin inhibitor 1;33 the biosynthesis of other
cyclic peptides, such as backbone-cyclized α-defensins and naturally
occurring θ-defensins, is currently underway in our laboratory.

FIGURE 2 Biosynthetic approach for in vivo production of cyclotides
kalata B1 and MCoTI-II inside live E. coli cells.18,29 Backbone
cyclization of the linear precursor is mediated by a modified protein
splicing unit or intein. The cyclized product then folds
spontaneously in the bacterial cytoplasm to give the folded cyclotide.

FIGURE 3 Relative affinities for trypsin of a series of MCoTI-I
mutants covering all the loop positions except loop 6 and Cys
residues. A model of cyclotide MCoTI-I bound to trypsin is shown at
the bottom indicating the position of the mutations. The side-chain of
residue K6 is shown in red bound to the specificity pocket of trypsin.
The model was produced by homology modeling at the Swiss model
workspace35 using the structure of the CPTI-II-trypsin complex (PDB
code: 2btc)36 as template. The structure was generated using the PyMol
software package. Figure adapted from reference 16.

Recombinant biosynthesis of cyclic polypeptides offers many
advantages over purely synthetic methods.32 Using the tools of
molecular biology, large combinatorial libraries of cyclic peptides
may be generated and screened in vivo. A typical chemical synthesis
may generate 10^4 different molecules, whereas it is not uncommon for
a recombinant library to contain as many as 10^9 members. The
molecular diversity generated by this approach is analogous to that of
phage-display technology. The approach, however, differs from phage
display in that the backbone-cyclized polypeptides are not fused to or
displayed by any viral particle or protein, but remain inside the
living cell, where they can be further screened for biological
activity in a manner analogous to yeast two-hybrid technology.34 The
complex cellular cytoplasm provides the appropriate environment to
address the physiological relevance of potential leads.
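
To put the library sizes quoted above in perspective, the following sketch does the underlying arithmetic. It assumes, purely for illustration, that each randomized position can take any of the 20 proteinogenic amino acids; the text does not specify how the libraries are diversified, and the function names are hypothetical.

```python
def library_size(randomized_positions: int, alphabet: int = 20) -> int:
    """Number of distinct sequences when each position is fully randomized."""
    return alphabet ** randomized_positions


def positions_needed(target_size: int, alphabet: int = 20) -> int:
    """Smallest number of fully randomized positions that reaches target_size."""
    positions, size = 0, 1
    while size < target_size:
        size *= alphabet
        positions += 1
    return positions


print(positions_needed(10 ** 4))  # positions needed to exceed a 10^4-member set
print(positions_needed(10 ** 9))  # positions needed to exceed a 10^9-member library
print(library_size(7))            # theoretical diversity of seven randomized positions
```

Under this assumption, a library of about 10^9 members is enough to sample roughly seven fully randomized positions, that is, on the order of one complete cyclotide loop.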
Screening of Cyclotide-Based Libraries

The ability to create cyclic polypeptides in vivo opens up the
possibility of generating large libraries of cyclic polypeptides.
Using the tools of molecular biology, genetically encoded libraries of
cyclic polypeptides containing billions of members can be readily
generated. This tremendous molecular diversity forms the basis for
selection strategies that model natural evolutionary processes. Also,
because the cyclic polypeptides are generated inside living cells,
these libraries can be directly screened for their ability to
attenuate or inhibit cellular processes.

We have recently reported the biosynthesis of a genetically encoded
library of MCoTI-I based cyclotides in Escherichia coli cells.19 The
cyclization/folding of the library was performed either in vitro, by
incubation with a redox buffer containing glutathione, or by in vivo
self-processing of the corresponding precursor proteins. Of 27
mutations studied, only two, G25P and I20G, negatively affected the
folding of cyclotides (Figures 1 and 3). These data provide
significant insights into the structural constraints of the MCoTI
cyclotide framework and the functional elements for trypsin binding.
To our knowledge, this is the first report of the biosynthesis of a
genetically encoded library of MCoTI-based cyclotides containing a
complete suite of amino acid mutants. Craik and coworkers have also
recently reported the chemical synthesis of a complete suite of Ala
mutants for kB1.18 These mutants were fully characterized structurally
and functionally. Their results indicated that only two of the
mutations explored (kB1 W20A and P21A, both located in loop 5, see
Figure 1) prevented folding.115 The mutagenesis results obtained in
our work are similar, highlighting the extreme robustness of the
cyclotide scaffold to mutations. These studies show that cyclotides
may provide an ideal scaffold for the biosynthesis of large
combinatorial libraries inside living bacterial cells, which can then
be screened in-cell for biological activity using high-throughput
flow cytometry techniques.37-39

Our group has recently developed a cell-based protease screening
reporter that uses a pair of optimized genetically encoded fluorescent
proteins for FRET-based screening.  59 We are currently using this
approach for screening genetically encoded libraries based on the
MCoTI-I cyclotide scaffold against several proteases, including
anthrax lethal factor.

CONCLUSION  AND  REMARKS 

In summary, cyclotides have several characteristics that make them
ideal drug development tools. First, they are remarkably stable due to
the cystine knot. Second, they are small, making them readily
accessible to chemical synthesis and therefore chemical lead
optimization.23,40-42 Third, they can be encoded within standard
cloning vectors, expressed in bacteria or animal cells, and are
amenable to substantial sequence variation. Finally, some cyclotides
have been shown to be orally bioavailable20 and able to cross the cell
membrane through macropinocytosis.21 These characteristics make them
ideal substrates for molecular evolution strategies to enable
generation and selection of compounds with optimal binding and
inhibitory characteristics. Cyclotides thus appear as very promising
leads or frameworks for the development of novel peptide-based
therapeutics and diagnostics.

Introduction

Cyclotides are fascinating microproteins that are present in plants
from Violaceae, Rubiaceae and Cucurbitaceae and exhibit various
biological properties such as protease inhibitory, antimicrobial,
insecticidal, cytotoxic, anti-HIV or hormone-like activity.[1,2] They
share a unique head-to-tail circular knotted topology of three
disulfide bridges, with one disulfide penetrating through the
macrocycle formed by the two other disulfides and interconnecting
peptide backbones to form what is called a cystine knot topology
(Figure 1 A).

Cyclotides have several characteristics that make them promising
leads or frameworks for peptide drug design.[3,4] The cystine knot and
cyclic backbone topology make them exceptionally resistant to thermal,
chemical, and enzymatic degradation compared with other peptides of
similar size.[5] Some cyclotides have been shown to be orally
bioavailable. For example, the first cyclotide to be discovered,
Kalata B1, was found to be an orally effective uterotonic,[6] and
other cyclotides have been shown to cross the cell membrane through
macropinocytosis.[7] Moreover, immunogenicity is generally considered
not to be a major issue for small-sized and stable
microproteins.[8,9] Cyclotides are also amenable to substantial
sequence variation and they can be considered as natural combinatorial
peptide libraries structurally constrained by the cystine-knot
scaffold and head-to-tail cyclization.[2,10] Cyclotides can also be
chemically synthesized, thus allowing the introduction of specific
chemical modifications or biophysical probes.[11-14] More importantly,
cyclotides can now be biosynthesized in E. coli cells through a
biomimetic approach that involves the use of modified protein splicing
units[15,16] (Figure 2). This makes them ideal scaffolds for
molecular evolution strategies to enable the generation and selection
of compounds with optimal binding and inhibitory characteristics
against particular molecular targets.

Investigation of the contribution of individual residues to the
structural integrity and biological activities of particular
cyclotides is crucial for their use in any potential pharmaceutical
application.[17] A better understanding of the structural limitations
of the cyclotide scaffold, so that sequence modifications in
structurally important regions are avoided, can greatly assist in the
correct design of cyclotide-based libraries for molecular screening
and the selection of de novo sequences with new biological activities
or in developing grafted analogues for use as peptide-based
drugs.[14,18] Understanding the molecular basis for bioactivity might
also allow the minimization or avoidance of undesirable properties,
such as cytotoxicity or hemolytic activity, that are found in some
cyclotides.[17]

The cyclotides MCoTI-I/II are powerful trypsin inhibitors that have
been recently isolated from the dormant seeds of Momordica
cochinchinensis, a plant member of the Cucurbitaceae family.[19]
Although MCoTI cyclotides do not share significant sequence homology
with other cyclotides beyond the presence of the three cystine
bridges, solution NMR has shown that they adopt a similar
backbone-cyclic cystine-knot topology[20,21] (Figure 1 A). MCoTI
cyclotides, however, share a high sequence homology with related
cystine-knot trypsin inhibitors found in squash such as EETI, and it
is likely that they bind trypsin in a manner similar to that of the
EETI family (Figure 1 B).[19] Hence, cyclic MCoTIs represent
interesting candidates for drug design, either through changing their
specificity of inhibition or through using their structure as natural
scaffolds possessing new binding activities.

Results  and  Discussion 

In the current study we report the biosynthesis and screening of
biological activity of libraries based on the cyclotide MCoTI-I. These
libraries were designed to contain multiple MCoTI-I mutants, in which
all the residues in loops 1-5, except for the Cys residues involved in
the cystine-knot, were replaced by different types of amino acid.
These mutations included the introduction of neutral (Ala), flexible
and small (Gly), hydrophilic (Ser and Thr), hydrophobic (Met and Val),
constrained (Pro) and aromatic (Tyr and Trp) residues (see Table 1).
The only residue in loop 6 that was mutated was Val1. This residue is
a hydrophobic β-branched amino acid highly conserved in other squash
trypsin inhibitors (STIs) and is in close proximity to Lys4 in loop 1,
which is responsible for MCoTI's ability to inhibit trypsin. The rest
of the residues in loop 6 are not required for folding or biological
activity in linear STIs[22] and therefore were not explored. It is
believed that loop 6 acts as a very flexible linker to allow
cyclization.[23] To our knowledge this is the first time that a
cyclotide-based library has been biosynthesized in E. coli cells, and
that a complete amino acid scan has been carried out on MCoTI-I to
explore the effects of individual amino acids on biological activity
and structural requirements.

Figure 1. A) Primary and tertiary structures of MCoTI and Kalata
cyclotides isolated from Momordica cochinchinensis and Oldenlandia
affinis, respectively.[6,20,21] B) Multiple sequence alignment of
cyclotide MCoTI-I with other squash trypsin inhibitors. Multiple
sequence alignment was performed by using TCoffee
(http://ca.expasy.org/cgi-bin/hub) and visualized by using
Jalview.[31]

The biosynthesis of MCoTI-I mutants was carried out by using a
protein splicing unit in combination with an in-cell intramolecular
native chemical ligation (NCL) reaction (Figure 2).[15,16]
Intramolecular NCL requires the presence of an N-terminal Cys residue
and a C-terminal α-thioester group in the same linear precursor
molecule.[24,25] For this purpose, the MCoTI-I linear precursors were
fused in frame at their C and N termini to a modified Mxe gyrase A
intein and a Met residue, respectively. This allows the generation of
the required C-terminal thioester and N-terminal Cys residue after in
vivo processing by endogenous Met aminopeptidase (MAP). We used the
native Cys located at the beginning of loop 6 to facilitate the
cyclization. This linear construct has been shown to give very good
expression and cyclization yields in vivo.[16]

Figure 2. Biosynthetic approach to the production of cyclotide
MCoTI-I libraries inside living E. coli cells. Backbone cyclization of
the linear precursor is mediated by a modified protein splicing unit
or intein. The cyclized peptide then folds spontaneously in the
bacterial cytoplasm.
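
The precursor design described in the text can be illustrated with a short sketch: the cyclic sequence is opened at a Cys so that, once the initiating Met is removed, the N-terminal residue is Cys, as NCL requires. The cyclic MCoTI-I sequence below is simply the wild-type sequence of Table 1 rewritten to start in loop 6; that starting point, the function name, and the "-[intein]" placeholder are illustrative assumptions and not the actual construct sequence.

```python
def linear_precursor(cyclic_seq: str, start: int) -> str:
    """Open a cyclic sequence so that the residue at index `start` is N-terminal."""
    if cyclic_seq[start] != "C":
        raise ValueError("intramolecular NCL needs an N-terminal Cys")
    return cyclic_seq[start:] + cyclic_seq[:start]


# MCoTI-I written as a circle, starting arbitrarily at the GG of loop 6.
MCOTI_I_CYCLIC = "GGVCPKILQRCRRDSDCPGACICRGNGYCGSGSD"

# In this representation the Cys at the beginning of loop 6 is the last Cys.
loop6_cys = MCOTI_I_CYCLIC.rindex("C")
precursor = linear_precursor(MCOTI_I_CYCLIC, loop6_cys)

# Met is removed in vivo by Met aminopeptidase; the intein supplies the thioester.
construct = "M" + precursor + "-[intein]"
print(precursor)
print(construct)
```

Opening the circle at this Cys reproduces the linear wild-type sequence listed in Table 1.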

In order to facilitate the analysis and processing of all the
mutants, two libraries (Lib1 and Lib2) were produced that contained 13
and 15 different MCoTI-I sequences, respectively (see Table 1). These
libraries were designed to contain mutants that could be easily
identified by ES-MS. In both libraries, the MCoTI-I wild-type (wt)
sequence was included as a control. Synthetic dsDNA fragments encoding
the different MCoTI-I mutants were ligated into plasmid pTXB1 in frame
with the Mxe gyrase intein (Table S1 in the Supporting Information).
The resulting plasmid libraries were transformed into competent DH5α
E. coli cells to give approximately 10^4 colonies (data not shown).
All the colonies were pooled, and the corresponding plasmid library
was transformed into E. coli Origami2(DE3) for protein overexpression.
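
Identification by ES-MS relies on each library member having a distinguishable mass. As a hedged sketch of how the expected values in Table 1 can be approximated, the code below sums standard average residue masses for the backbone-cyclized peptide (no terminal water) and subtracts six hydrogens for the three disulfide bonds. The residue-number offset of 8 between the mutant names and the linear sequence is an assumption read off Table 1, and the function names are hypothetical.

```python
# Standard average residue masses (Da) of the amino acids, minus water.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
H_AVG = 1.00794  # average atomic mass of hydrogen

WT_LINEAR = "CGSGSDGGVCPKILQRCRRDSDCPGACICRGNGY"  # wt sequence from Table 1


def cyclic_oxidized_mass(seq: str, n_disulfides: int = 3) -> float:
    """Average mass of a backbone-cyclized peptide with n disulfide bonds."""
    # A cyclic backbone has no free termini, so no water is added.
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) - 2 * n_disulfides * H_AVG


def mutate(seq: str, mutation: str, offset: int = 8) -> str:
    """Apply a mutation such as 'K4A'; residue n sits at 1-based index n + offset."""
    original, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position + offset - 1
    if seq[index] != original:
        raise ValueError(f"expected {original} at position {position}, found {seq[index]}")
    return seq[:index] + new + seq[index + 1:]


for name in ("K4A", "I20G", "G25P"):
    print(name, round(cyclic_oxidized_mass(mutate(WT_LINEAR, name)), 2))
print("wt", round(cyclic_oxidized_mass(WT_LINEAR), 2))
```

With these mass values the wild type comes out within about 0.1 Da of the expected value listed in Table 1; small differences are to be expected from the choice of atomic mass tables.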


Expression of the library in E. coli produced the corresponding MCoTI
mutant-gyrase intein linear fusion precursors with yields similar to
that of the wild-type MCoTI-I.[16] The level of in vivo cleavage was
estimated to be ≈80% following induction for 20 h at 20 °C
(Figure S1). These expression conditions maximize the in vivo
processing of the linear intein fusion precursors to give natively
folded MCoTI cyclotides.[16] In vivo cleavage and processing of the
corresponding intein linear fusion precursor can be reduced by
induction at slightly elevated temperatures for short times (e.g.,
30 °C for 2-4 h) while keeping a similar level of protein expression
(Figure S1). This allowed us to vary the amounts of folded MCoTI
mutants produced to suit different screening methods. For in vitro
screening, cyclization can be accomplished in vitro under controlled
conditions, and therefore short induction times at relatively higher
temperatures are used to yield more uncleaved linear precursor.
Alternatively, in vivo cyclization yields can be easily maximized by
using longer induction times and lower induction temperatures (e.g.,
20 °C for 20 h) for high-throughput in vivo screening.

In order to characterize the MCoTI-based libraries and assess the
structural integrity of the MCoTI mutants, we used the ability of
MCoTI to bind trypsin. Purified MCoTI mutant-gyrase fusion proteins
were obtained from E. coli Origami2(DE3) cells that were induced at
30 °C for 4 h. Under these conditions, only ≈30% of the intein linear
precursors were processed in vivo. The fusion precursors were cleaved
and cyclized in phosphate buffer (pH 7.2) containing 50 mM glutathione
(GSH) for 36 h. In our hands, GSH has been shown to be more effective
than other thiols in promoting cyclization and native-like folding of
cyclotides and other disulfide-containing peptides in
vitro.[15,16,26] This treatment resulted in nearly 100% cleavage of
the intein precursors. The soluble fractions were purified on
trypsin-sepharose beads, and the bound fractions were analyzed by HPLC
and ES-MS to determine the relative presence of the library members
able to bind trypsin (Figure 3 A).

As anticipated, the MCoTI-K4A mutant was not found in the
trypsin-bound fraction. This residue determines binding affinity and
specificity, and can only be replaced by Arg to maintain biological
activity.[19] Analysis of the cyclization reaction before affinity
purification confirmed the presence of this mutant in the
corresponding library (Figure S2). The K4A mutant was also
individually cyclized, purified and characterized by NMR, and showed a
native cyclotide topology when compared to MCoTI-I wt (Figure S3 and
Table S2), therefore indicating that the lack of biological activity
of this mutant was due to the replacement of Lys4 by Ala, and not to
the adoption of a non-native fold.

The  mutant  MCoTI-l  G25P  was  also  absent  in  the  trypsin- 
bound  fraction.  In  vitro  cyclization  of  this  mutant  revealed  that 
the  intein  precursor  of  this  cyclotide  was  not  processed  effi¬ 
ciently,  and  the  resulting  cyclotide  was  not  able  to  fold  proper¬ 
ly.  Only  traces  of  natively  folded  G25P  were  detected  in  the 
GSH-induced  cyclization/folding  of  the  corresponding  intein 
precursor  (Figure  S4).  The  inefficient  cleavage  of  this  mutant 
precursor  could  be  explained  by  the  proximity  of  a  Pro  residue 
Table 1. Sequences and molecular weights found for the different MCoTI-I mutants used in this work.

Name   Sequence                             Molecular weight [Da]
                                            Expected   Found

Lib1
wt     CGSGSDGGVCPKILQRCRRDSDCPGACICRGNGY   3480.94    3480.4 ± 0.6
P3A    CGSGSDGGVCAKILQRCRRDSDCPGACICRGNGY   3454.91    3454.2 ± 0.3
K4A    CGSGSDGGVCPAILQRCRRDSDCPGACICRGNGY   3423.85    3423.7 ± 0.6
I5T    CGSGSDGGVCPKTLQRCRRDSDCPGACICRGNGY   3468.89    3468.0 ± 0.1
L6A    CGSGSDGGVCPKIAQRCRRDSDCPGACICRGNGY   3438.86    3438.7 ± 1.0
Q7G    CGSGSDGGVCPKILGRCRRDSDCPGACICRGNGY   3409.87    3408.5 ± 1.2
Q7M    CGSGSDGGVCPKILMRCRRDSDCPGACICRGNGY   3484.01    3483.7 ± 0.6
R8A    CGSGSDGGVCPKILQACRRDSDCPGACICRGNGY   3395.84    3396.0 ± 0.1
D12A   CGSGSDGGVCPKILQRCRRASDCPGACICRGNGY   3436.93    3435.7 ± 0.6
S13A   CGSGSDGGVCPKILQRCRRDADCPGACICRGNGY   3464.95    3465.0 ± 1.0
G17A   CGSGSDGGVCPKILQRCRRDSDCPAACICRGNGY   3494.97    3495.0 ± 1.0
A18G   CGSGSDGGVCPKILQRCRRDSDCPGGCICRGNGY   3466.92    3467.0 ± 1.7
Y26A   CGSGSDGGVCPKILQRCRRDSDCPGACICRGNGA   3388.85    3388.3 ± 1.5

Lib2
wt     CGSGSDGGVCPKILQRCRRDSDCPGACICRGNGY   3480.94    3480.4 ± 0.6
V1A    CGSGSDGGACPKILQRCRRDSDCPGACICRGNGY   3452.89    3453.0 ± 1.0
V1S    CGSGSDGGSCPKILQRCRRDSDCPGACICRGNGY   3468.89    3468.7 ± 1.2
R10A   CGSGSDGGVCPKILQRCARDSDCPGACICRGNGY   3395.84    3396.0 ± 0.1
R10G   CGSGSDGGVCPKILQRCGRDSDCPGACICRGNGY   3381.81    3380.3 ± 0.6
R11V   CGSGSDGGVCPKILQRCRVDSDCPGACICRGNGY   3423.89    3423.7 ± 0.6
D14A   CGSGSDGGVCPKILQRCRRDSACPGACICRGNGY   3436.93    3436.7 ± 1.2
P16A   CGSGSDGGVCPKILQRCRRDSDCAGACICRGNGY   3454.91    3455.0 ± 1.7
I20G   CGSGSDGGVCPKILQRCRRDSDCPGACGCRGNGY   3424.84    3423.7 ± 0.6
R22W   CGSGSDGGVCPKILQRCRRDSDCPGACICWGNGY   3510.97    3510.7 ± 1.2
G23A   CGSGSDGGVCPKILQRCRRDSDCPGACICRANGY   3494.97    3495.0 ± 1.0
N24W   CGSGSDGGVCPKILQRCRRDSDCPGACICRGWGY   3553.05    3552.7 ± 1.2
G25P   CGSGSDGGVCPKILQRCRRDSDCPGACICRGNPY   3521.01    3521.0 ± 1.4
Y26A   CGSGSDGGVCPKILQRCRRDSDCPGACICRGNGA   3388.44    3388.3 ± 1.5
Y26W   CGSGSDGGVCPKILQRCRRDSDCPGACICRGNGW   3503.98    3503.3 ± 1.2

to the MCoTI-intein junction, which might affect the ability of
the gyrase intein to produce the thioester intermediate required
for the intramolecular cyclization. The Gly25 residue is
located on loop 5 and is extremely well conserved in the STIs
(Figure 1B), thus corroborating the importance of this residue
for correct folding of MCoTI cyclotides.

Remarkably, the remaining mutants were identified in the
trypsin-bound fraction, thus indicating that they were able to
adopt a native-like structure whose ability to bind trypsin was
not significantly disrupted. All the active mutants besides I20G
were produced with similar yields to the MCoTI wt (within
50% of the average value), as quantified by HPLC and ES-MS.
The folded I20G mutant abundance was estimated to be
≈10% of the average. Cleavage and cyclization of I20G with
GSH revealed that, although the thiol-induced cleavage was
very efficient, the correctly folded mutant was produced in
very low yield (Figure S5); this indicated the importance of this
residue for efficient folding in MCoTI cyclotides. In fact, this
residue, which is located between the Cys residues at the end
and beginning of loops 3 and 5, respectively, is well conserved
among the different linear STIs and cyclotides; this shows a
preference for β-branched residues and hydrophilic residues at
this position. Interestingly, the correctly folded I20G mutant
was able to bind trypsin beads, thus confirming the mutant's
ability to adopt a native folded structure.


Next, we screened the biological activity of the MCoTI-I libraries
produced in vivo. For this purpose both libraries (Lib1 and Lib2)
were expressed in E. coli Origami2(DE3) cells at 20 °C for 20 h in
order to maximize the intracellular processing and folding of the
different intein precursors. After the cells had been lysed by
sonication, the cellular supernatant was purified by using
trypsin-sepharose under competing binding conditions, as described
above. The different fractions were then analyzed and quantified by
HPLC and ES-MS. The results obtained were very similar to those
found with in vitro cyclized libraries (data not shown), thus
indicating that the compositions of the libraries obtained in vitro
and in vivo were practically identical.

In order to establish the relative affinities of the different
mutants that are able to bind trypsin versus MCoTI-I wild type, in
vitro and in vivo cyclized libraries were incubated with
trypsin-sepharose under competing conditions, that is, using only
≈20% of the trypsin-sepharose beads required for stoichiometric
binding. Under these conditions, cyclotides with tighter affinities
outcompete the others for binding to trypsin, leaving the members of
the library with weaker affinities in the supernatant (i.e., the
unbound fraction). This supernatant was then purified again by using
the same approach to extract the remaining active cyclotides. This
process was repeated several times until all the active cyclotides
found in a particular library sample were completely extracted, so
that the tighter binders were recovered in the first affinity
purifications and the weaker binders in the later rounds of this
sequential extraction process. All the different trypsin-bound
fractions were then analyzed and quantified by HPLC and ES-MS
(Figure 3B). The results, summarized in Figure 4, show that with
MCoTI-I wt as internal reference, the mutants N24W and R22W were
consistently able to slightly outcompete the rest of the mutants
(including wt), thus indicating a somewhat tighter affinity for
trypsin than the wt sequence. Most of the remaining mutants (P3A,
Q7M, R8A, R10A, R10G, R11V, D12A, S13A, D14A, P16A, G17A, A18G,
G23A, Y26A and Y26W) showed elution profiles similar to that of wt,
indicating a similar affinity for trypsin. Mutants V1A, V1S, I5T,
L6A and Q7G, on the other hand, were consistently extracted after
MCoTI wt; this indicates a lower affinity for trypsin than the wt
sequence. I20G (not shown) was also extracted after MCoTI-I wt;
however, this could be due to the low abundance of folded cyclotide.

Figure 3. Analytical reversed-phase HPLC traces of trypsin-bound fractions from MCoTI Lib1 (left) and Lib2 (right) libraries obtained in vitro by GSH-induced
cleavage and folding. A) Total trypsin-bound fractions. The position where mutant K4A should be eluting is shown in red. B) Sequential fractions extracted
during competitive trypsin-binding experiments (see text for detailed description).
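The comparison of elution profiles against the wt reference can be illustrated with a short sketch. It is not the authors' analysis: the peak areas below are invented placeholders, and the abundance-weighted mean extraction round, the `tolerance` threshold and the function names are assumptions chosen only to show how "extracted earlier than wt" or "later than wt" could be turned into a number.

```python
def mean_extraction_round(areas):
    """Abundance-weighted mean extraction round (1-based).

    `areas` holds the integrated HPLC peak area of one cyclotide in each
    sequential trypsin-bound fraction, in extraction order.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("profile has no signal")
    return sum(rnd * area for rnd, area in enumerate(areas, start=1)) / total


def classify_against_reference(profiles, reference="wt", tolerance=0.25):
    """Label each profile relative to the reference profile.

    A mutant extracted earlier than the reference (lower mean round) is
    labeled 'tighter'; one extracted later is labeled 'weaker'.
    The tolerance (in rounds) is an arbitrary, hypothetical cutoff.
    """
    ref = mean_extraction_round(profiles[reference])
    labels = {}
    for name, areas in profiles.items():
        delta = mean_extraction_round(areas) - ref
        if delta < -tolerance:
            labels[name] = "tighter"
        elif delta > tolerance:
            labels[name] = "weaker"
        else:
            labels[name] = "similar"
    return labels


# Hypothetical peak areas for six sequential extractions (not measured data).
profiles = {
    "wt": [30, 40, 20, 10, 0, 0],
    "mutant_early": [60, 30, 10, 0, 0, 0],
    "mutant_late": [5, 15, 30, 30, 15, 5],
}
labels = classify_against_reference(profiles)
print(labels)
```

A single summary number per mutant hides the shape of the profile, so in practice it would complement, not replace, inspection of the full elution curves shown in Figure 4.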

Although there is no structure available for the complex between
MCoTI cyclotides and trypsin, the structures of several
complexes formed between different STIs and trypsin have
been reported so far.[27,28] Based on the high sequence homology
between these trypsin inhibitors and MCoTI cyclotides
(Figure 1), it is reasonable to assume that they possess the
same binding mode to trypsin.[19] Therefore, it is not surprising
that mutant K4A was not able to bind trypsin, since K4 is critical
for binding to the trypsin specificity pocket.[19] Other mutations
in loop 1 also negatively affected trypsin binding. Hence,
mutants I5T, L6A and Q7G were consistently eluted in the later
fractions in our competing binding experiments, thus indicating
a weaker affinity for trypsin. The sequence Lys/Arg-Ile-Leu
in loop 1 is extremely well conserved in all linear STIs; this
suggests that it is required for efficient trypsin binding. Position 7,
on the other hand, seems to be more promiscuous and it is
able to accept hydrophobic residues (Q7M showed a similar
elution pattern to MCoTI-wt) and positively charged residues
(cyclotide MCoTI-II has a Lys residue in this position), but not a
small and flexible residue like Gly (mutant Q7G shows weaker
trypsin affinity than wt). Also in loop 1, mutation R8A did not
significantly affect trypsin binding, and the corresponding
mutant showed an elution pattern similar to that of wt. In
agreement with this result, this position is not especially well
conserved in linear STIs, allowing the presence of charged
(both positively and negatively) and Pro residues.

Figure 4. Elution profiles for members of the MCoTI Lib1 (left) and Lib2 (right) extracted by using trypsin-sepharose beads under competing conditions. The
results shown are the average data obtained in vivo and in vitro (vertical bars indicate standard deviation). Quantification was done by integration of the
corresponding HPLC peaks monitored at 220 nm (Figure 3B). ES-MS was used to calculate the ratio of different cyclotides present in peaks with multiple
products or not well resolved by HPLC. The area estimated for mutants with loss or gain of aromatic residues was corrected accordingly to take this into
account?21

All the mutations explored in loop 2 had similar elution patterns
to the wt sequence. This should be expected since this
loop is solvent exposed and on the opposite side to loop 1.
The only mutation affecting trypsin binding in loop 3 was
represented by mutant P16A. The rest of the mutants in this loop
behaved similarly to the wt sequence. This loop is partially
exposed in the structure of several linear STIs with trypsin and it
shows significant sequence heterogeneity among the different
STIs. Position 16, however, is usually occupied in other STIs by
hydrophobic residues (mainly Leu and Met); this could explain
the observed behavior of mutant P16A.

None of the mutations in loop 5, besides G25P, had an adverse
effect on trypsin binding. It is interesting to remark that
mutants Y26A and Y26W showed a similar elution profile to
the wt sequence (Figure 4). This position is very well conserved
among different STIs, being occupied mainly by aromatic
residues (Tyr or Phe) or, in some cases, Ile and His. Analysis of the
structure of linear Cucurbita pepo trypsin inhibitor-II (CPTI-II,
which shares ≈75% sequence homology with MCoTI-I) complexed
with bovine trypsin[27] shows that this position makes a
direct contact with the trypsin Tyr151 residue, which is highly
conserved among different trypsin homologues. Intriguingly,
mutants R22W and N24W seemed to slightly outcompete the
rest of the library members including MCoTI-I wt in our
competing binding experiments (Figures 3B and 4). These residues
are in close proximity to Tyr26 and they could help to further
stabilize the aromatic interaction described before between
the MCoTI-I mutants and trypsin.

The position corresponding to Val1 at the end of loop 6 was
also explored by including mutants V1A and V1S in this study.
This position favors the presence of hydrophobic residues
(mainly Val, Met and Ile) among different STIs, although
hydrophilic and charged residues are also found in some STIs. Visual
inspection of the CPTI-II-trypsin complex,[27] for example,
reveals that this position is in close proximity to the trypsin
Trp215. This aromatic residue is highly conserved among the
different trypsin homologues, thus indicating that this interaction
might be important to the stabilization of the complex.
Consistent with this finding, replacement of Val1 in MCoTI-I by
Ser and Ala produced mutants that consistently showed a
weaker affinity than MCoTI-I wt (Figure 5).


Figure 5. Summary of the relative affinities for trypsin of the different MCoTI-I mutants studied in this work. A
model of cyclotide MCoTI-I bound to trypsin is shown at the bottom and indicates the positions of the mutations.
The Lys4 side chain is shown in red bound to the specificity pocket of trypsin. The model was produced by
homology modeling at the Swiss model workspace (http://swissmodel.expasy.org//SWISS-MODEL.html)[33] by using
the structure of CPTI-II-trypsin complex (PDB ID: 2btc)[27] as the template. The structure was generated by using
the PyMol software package.
[Figure 5 graphic: mutation types (neutral, small, hydrophilic, hydrophobic, constrained, aromatic) mapped onto the sequence ILQRCRRDSDCPGACICRGNGYCGSGSDGG, with loops 1, 2, 3 and 5 labeled.]

In summary, these data provide significant insights into the
structural constraints of the MCoTI cyclotide framework and
the functional elements for trypsin binding. To our knowledge,
this is the first time that the biosynthesis of a genetically
encoded library of MCoTI-based cyclotides containing a
comprehensive suite of amino acid mutants has been reported. Craik
and co-workers have also recently reported the chemical
synthesis of a complete suite of Ala mutants for the cyclotide
kalata B1 (KB-1).[17] These mutants were fully characterized
structurally and functionally. Their results indicated that only two of
the mutations explored (KB-1 W20A and P21A, both located in
loop 5, see Figure 1) prevented folding.[17] The mutagenesis
results obtained in our work show similar results, thus highlighting
the extreme robustness of the cyclotide scaffold to mutations.
Only two of the 27 mutations studied in cyclotide
MCoTI-I, G25P and I20G, negatively affected the adoption of a
native cyclotide fold. Intriguingly, the rest of the mutations
allowed the adoption of a native fold, as indicated by ES-MS
analysis and their ability to bind trypsin (or NMR in the case of
K4A). These results should provide an excellent starting point
for the effective design of MCoTI-based cyclotide libraries for
rapid screening and selection of de novo cyclotide sequences
with specific biological activities.

The libraries used in this work were produced either in vitro
by GSH-induced cyclization/folding or by in vivo self-processing
of the corresponding precursor proteins. In both cases the
results were similar; this indicates that this approach is quite
general for the production of complex libraries. Importantly, the in
vivo biosynthesis of cyclotide-based libraries could have
tremendous potential for drug discovery. This study shows that
MCoTI cyclotides could provide an ideal scaffold for the
biosynthesis of large combinatorial libraries inside living E. coli cells.
Coupled to an appropriate in vivo reporter system, this library
could rapidly be screened by using high-throughput technologies
such as fluorescence-activated cell sorting.[29,30]

See the Supporting Information for experimental details.


National  Laboratory. 
Supplemental Information


General  materials  and  methods 

Analytical HPLC was performed on a HP1100 series instrument with 220 nm and 280
nm detection using a Vydac C18 column (5 μm, 4.6 × 150 mm) at a flow rate of 1
mL/min. Semipreparative HPLC was performed on a Waters Delta Prep system fitted
with a Waters 2487 Ultraviolet-Visible (UV-vis) detector using a Vydac C18 column (15-
20 μm, 10 × 250 mm) at a flow rate of 5 mL/min. All runs used linear gradients of 0.1%
aqueous trifluoroacetic acid (TFA, solvent A) vs. 0.1% TFA, 90% acetonitrile in H2O
(solvent B). UV-vis spectroscopy was carried out on an Agilent 8453 diode array
spectrophotometer, and fluorescence analysis on a Jobin Yvon Fluorolog-3
spectrofluorometer. Electrospray mass spectrometry (ES-MS) analysis was routinely
applied to all cyclized peptides. ES-MS was performed on a Sciex API-150EX single
quadrupole electrospray mass spectrometer; MS/MS was performed on an Applied
Biosystems API 3000 triple quadrupole mass spectrometer. Calculated masses were
obtained by using ProMac v1.5.3. Protein samples were analyzed by SDS-PAGE.
Samples were run on Invitrogen (Carlsbad, CA) 4-20% Tris-Glycine Gels. The gels were
then stained with Pierce (Rockford, IL) Gelcode Blue, photographed/digitized using a
Kodak (Rochester, NY) EDAS 290, and quantified using NIH Image-J software
(http://rsb.info.nih.gov/ij/). DNA sequencing was performed by Davis Sequencing (Davis,
CA) or the DNA Sequencing and Genetic Analysis Core Facility at the University of Southern
California using an ABI 3730 DNA sequencer, and the sequence data were analyzed with
DNAStar (Madison, WI) Lasergene v5.5.2. All chemicals were obtained from Sigma-
Aldrich (Milwaukee, WI) unless otherwise indicated.
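As a rough cross-check of the calculated masses (which the authors obtained with ProMac), the following sketch sums average residue masses for a backbone-cyclized peptide and subtracts two hydrogens per disulfide bond. It is an independent illustration, not the ProMac algorithm; the function name and the rounding of the mass table are assumptions, so small deviations (a few hundredths of a Da) from the tabulated values are expected.

```python
# Average residue masses in Da (amino acid minus water), standard values.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
H_AVG = 1.00794  # average atomic mass of hydrogen


def cyclotide_average_mass(sequence, disulfides=3):
    """Average mass of a head-to-tail cyclized peptide with disulfide bonds.

    Cyclization closes the backbone, so no terminal water is added;
    each disulfide bond removes two hydrogens.
    """
    residues = sum(AVG_RESIDUE_MASS[aa] for aa in sequence)
    return residues - 2 * disulfides * H_AVG


WT = "CGSGSDGGVCPKILQRCRRDSDCPGACICRGNGY"   # MCoTI-I wt (Table 1)
K4A = "CGSGSDGGVCPAILQRCRRDSDCPGACICRGNGY"  # K4A mutant (Table 1)

for name, seq in (("wt", WT), ("K4A", K4A)):
    print(f"{name}: {cyclotide_average_mass(seq):.2f} Da")
```

Setting `disulfides=0` gives the mass of the reduced circular form, which differs from the oxidized form by six hydrogens.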


Construction  of  expression  plasmids 

Plasmids expressing the MCoTI-I precursors were constructed using the pTXB1
expression plasmid (New England Biolabs), which contains an engineered Mxe gyrase
intein and a chitin-binding domain (CBD). Oligonucleotides coding for the
MCoTI-I wild type and mutant sequences (Table S1) were synthesized, phosphorylated
and PAGE purified by IDT DNA (Coralville, IA). Complementary strands were annealed
in 0.3 M NaCl and the resulting double stranded DNA (dsDNA) was purified using
Qiagen's (Valencia, CA) miniprep column and buffer PN. The pTXB1 plasmid was double
digested with NdeI and SapI (NEB). The linearized vectors and the MCoTI-I encoding
dsDNA fragments were ligated at 15 °C overnight using T4 DNA Ligase (New England
Biolabs). The ligated plasmids were transformed into DH5α cells (Invitrogen) and plated
on Luria Broth (LB)-agar containing ampicillin. Positive colonies were grown in 5 mL LB
containing ampicillin at 37 °C overnight and the corresponding plasmids purified using a
Miniprep Kit (Qiagen). Plasmids were initially screened by EcoRI digestion, as this
restriction site is removed during cloning. Preliminary positives were expressed (see
below) and fully characterized by ES-MS.
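The oligonucleotide design can be sanity-checked in a few lines. The sketch below uses the wt p5 and p3 sequences from Table S1 and verifies two things: that the coding region of p5 translates to the MCoTI-I wt sequence, and that the two strands anneal with a 2-nt 5' overhang on p5 and a 3-nt 5' overhang on p3. The function names and the interpretation of the overhangs as the NdeI and SapI ends are assumptions made for illustration, not statements from the original protocol.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate an in-frame upper-case DNA string, codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# wt oligonucleotides as listed in Table S1 (5' to 3').
P5_WT = ("TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc"
         "gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac").upper()
P3_WT = ("GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc"
         "aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA").upper()

# The cyclotide coding region follows the 5'-TATG prefix of p5.
peptide = translate(P5_WT[4:])

# Annealed duplex: p5 without its 2-nt 5' overhang pairs with
# p3 without its 3-nt 5' overhang (assumed NdeI and SapI ends).
anneals = reverse_complement(P3_WT[3:]) == P5_WT[2:]

print(peptide, anneals)
```

The same check can be applied to any mutant pair in Table S1 by swapping in its p5 and p3 strings and comparing the translation with the corresponding entry in Table 1.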

Expression  and  purification  of  recombinant  proteins 

Origami(DE3) or Origami2(DE3) cells (Novagen, San Diego, CA) were transformed with
the MCoTI-I plasmids (see above). Expression was carried out in LB medium (1-2 L)
containing ampicillin at room temperature or 30 °C for 2 h or overnight (20 h),
respectively. Briefly, 5 mL of an overnight starter culture derived from either a single
clone or single plate (Ala-scan library) were used to inoculate 1 L of LB media. Cells
were grown to an OD at 600 nm of 0.5 at 37 °C, and expression was induced by the
addition of isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.3
mM at the temperatures and times mentioned above and in the manuscript. The cells
were then harvested by centrifugation. For fusion protein purification, the cells were
resuspended in 30 mL of lysis buffer (0.1 mM EDTA, 1 mM PMSF, 50 mM sodium
phosphate, 250 mM NaCl buffer at pH 7.2 containing 5% glycerol) and lysed by
sonication. The lysate was clarified by centrifugation at 15,000 rpm in a Sorvall SS-34
rotor for 30 min. The clarified supernatant was incubated with chitin-beads (2 mL
beads/L cells, New England Biolabs), previously equilibrated with column buffer (0.1 mM
EDTA, 50 mM sodium phosphate, 250 mM NaCl buffer at pH 7.2) at 4 °C for 1 h with
gentle rocking. The beads were extensively washed with 50 bead-volumes of column
buffer containing 0.1% Triton X-100 and then rinsed and equilibrated with 50 bead-
volumes of column buffer. In vivo cleavage was quantified by SDS-PAGE analysis of the
purified fusion proteins using the NIH Image-J software package.

Concomitant cleavage, cyclization and folding of MCoTI-I cyclotides with GSH.
Purified MCoTI-Intein-CBD fusion proteins were cleaved with 50 mM GSH in degassed
column buffer. The cyclization/folding reactions were kept for up to 2 days at 25 °C with
gentle rocking. For small scale reactions, aliquots were taken each day (when necessary)
and analyzed by HPLC. The reduced and oxidized circular MCoTI-I cyclotides were
analyzed by ES-MS (Table 2). The supernatant of the cyclization reaction was separated
by filtration and the beads were washed with additional column buffer (1 column volume
per mL of beads). The supernatant and washes were pooled, and the oxidized
circular peptides were typically purified by semipreparative HPLC using a linear gradient
of 15-45% solvent B over 30 min.
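For orientation, the solvent composition during the 15-45% solvent B gradient can be worked out with simple linear interpolation. The sketch below assumes an ideal linear gradient with no dwell volume and uses the solvent B composition given in the general methods (90% acetonitrile); the function names are illustrative only.

```python
def percent_b(t_min, start=15.0, end=45.0, duration=30.0):
    """Percentage of solvent B at time t_min for a linear gradient."""
    if not 0.0 <= t_min <= duration:
        raise ValueError("time outside the gradient window")
    return start + (end - start) * t_min / duration


def acetonitrile_percent(t_min, acn_in_b=90.0):
    """Acetonitrile content (% v/v) of the mobile phase at time t_min.

    Solvent A contains no acetonitrile, so only solvent B contributes.
    """
    return percent_b(t_min) * acn_in_b / 100.0


for t in (0, 15, 30):
    print(f"t = {t:2d} min: {percent_b(t):.1f}% B, "
          f"{acetonitrile_percent(t):.1f}% acetonitrile")
```

The gradient slope is 1% B per minute, which corresponds to 0.9% acetonitrile per minute under these assumptions.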

Purification of MCoTI-I based libraries using trypsin-sepharose beads.

Preparation of trypsin-sepharose beads: NHS-activated Sepharose was washed with 15
volumes of ice-cold 1 mM HCl. Each volume of beads was incubated with an equal
volume of coupling buffer (50 mM NaCl, 200 mM sodium phosphate buffer at pH 6.0)
containing 2 mg of Porcine Pancreas Trypsin type IX-S (14,000 units/mg) for 3 h with
gentle rocking at room temperature. The beads were then rinsed with 10 volumes of
coupling buffer, and incubated with excess coupling buffer containing 100 mM
ethanolamine (Eastman Kodak) for 3 h with gentle rocking at room temperature. Finally,
the beads were washed with 50 volumes of wash buffer (200 mM sodium acetate buffer
at pH 3, 250 mM NaCl) and stored in one volume of wash buffer.

≈30 mL of clarified lysates (in vivo obtained libraries) or 10 mL of GSH-induced
cyclization/folding reaction mixture (in vitro obtained libraries) were typically incubated
with 1.0 mL of trypsin-sepharose for one hour at room temperature with gentle rocking,
and centrifuged at 3000 rpm for 1 min. The beads were washed with 50 volumes of PBS
containing 0.1% Triton X-100, then rinsed with 50 volumes of PBS, and drained of excess
PBS. Bound peptides were eluted with 2.0 mL of 8 M GdmHCl and fractions were
analyzed by RP-HPLC and ES-MS/MS.

Competing  trypsin-binding  experiments 

Competing binding experiments were performed as described above but using 0.2 mL of
trypsin-sepharose beads instead. For every library sample, up to 6 sequential
extractions were performed. All competing trypsin-binding experiments were performed
in duplicate.

Recombinant expression of 15N-labeled MCoTI-I wt and K4A

15N-labeled cyclotides were produced by GSH-induced cleavage of the intein precursors
in vitro as described above. Expression of intein precursors was accomplished as
described above but growing the cells in M9 minimal medium containing 0.1% 15NH4Cl.
Folded cyclotides were purified by semipreparative HPLC using a linear gradient of 15-
45% solvent B over 30 min. The isolated yield for both purified MCoTI cyclotides was
around 0.3 mg/L. Purified products were characterized by HPLC and ES-MS (Fig. S5)
and 2D-NMR (Figure S2 and Table S2). Total yield was ≈0.5 mg of folded cyclotide per
liter of bacterial culture.


NMR characterization of MCoTI-I wt and K4A mutant


NMR samples were prepared by dissolving either [U-15N] MCoTI-I or [U-15N] K4A
MCoTI-I in 90% H2O/10% 2H2O (v/v) or 100% D2O to a concentration of approximately
0.2 mM, with the pH adjusted to 3.4 by addition of dilute HCl. All 1H NMR data were
recorded on a Bruker Avance II 700 MHz spectrometer equipped with a cryoprobe. Data
were acquired at 27 °C, and 2,2-dimethyl-2-silapentane-5-sulfonate (DSS) was used as
an internal reference. All 3D experiments, 1H{15N}-TOCSY-HSQC and 1H{15N}-NOESY,
were performed according to standard procedures[1] with spectral widths of 12 ppm in
the proton dimensions and 35 ppm in the nitrogen dimension. The carrier frequency was
centered on the water signal, and the solvent was suppressed by using the WATERGATE
pulse sequence. TOCSY (spin lock time 80 ms) and NOESY (mixing time 150 ms)
spectra were collected using 1024 t3 points, 256 t2 and 128 t1 blocks of 16 transients.
Spectra were processed using Topspin 1.3 (Bruker). Each 3D data set was apodized by
a 90°-shifted sinebell-squared function in all dimensions, and zero filled to 1024 × 512 × 256 points
prior to Fourier transformation. Assignments for the backbone nitrogens, Hα and H′
protons (Figure S2 and Tables S2 and S3) of folded MCoTI-I wt and mutant K4A were
obtained using standard procedures.[1,2]
Figure S0. Analysis by 4-20% gradient SDS-PAGE of the expression levels and in vivo
cleavage of MCoTI-Lib2 precursors using different cellular backgrounds. Expression
levels for MCoTI-Lib1 were similar. Induction was carried out at 30 °C (2 and 4 h) or 20 °C
(20 h) by adding 0.3 mM IPTG.


Figure S1. Analytical reversed-phase HPLC trace of GSH-induced cyclization of MCoTI-
Lib1 precursors before and after being purified by affinity chromatography on trypsin-
sepharose. Identification of the different MCoTI-I mutants was carried out by ES-MS.
Figure S2. Heteronuclear 1H{15N} HSQC spectra of recombinant MCoTI-I wt (black)
and K4A mutant (red). Chemical shift assignments of the wt MCoTI-I amino acid
residues are shown in black. K4A mutant residues that changed chemical shifts by more
than 0.3 ppm in the proton dimension are labeled in red. Small unassigned peaks in both wt
and K4A spectra of MCoTI-I are from a minor isomer of the protein due to a known
isomerization of the backbone at an Asp-Gly sequence in loop 6 of MCoTI-I. NMR
experiments were acquired on a Bruker Avance II 700 MHz NMR spectrometer equipped
with a cryoprobe at 27 °C. NMR samples of 0.2 mM of [U-15N] MCoTI-I and 0.25 mM of
[U-15N] K4A MCoTI-I were in 90% H2O/10% D2O adjusted to pH 3.5 by addition of dilute
HCl.


Figure S3. Analytical reversed-phase HPLC trace of GSH-induced cyclization of MCoTI-
G25P precursor before and after being purified by affinity chromatography on trypsin-
sepharose. Identification of the folded mutant was carried out by ES-MS.
[Figure S3 panels: GSH-induced cyclization/folding; trypsin-bound fraction; ES-MS analysis (3521.0 ± 1.4 Da).]
Figure S4. Analytical reversed-phase HPLC trace of GSH-induced cyclization of MCoTI-
I20G precursor before and after being purified by affinity chromatography on trypsin-
sepharose. Identification of the folded mutant was carried out by ES-MS.

Figure S5. Analytical reversed-phase HPLC trace of GSH-induced cyclization of 15N-
labeled MCoTI-I wt and K4A precursors before and after being purified by preparative
HPLC. Identification of the folded mutant was carried out by ES-MS. Expected molecular
weight for the 15N-labeled cyclotide is shown in parentheses.


Table S1. Forward (p5) and reverse (p3) 5'-phosphorylated oligonucleotides used to
clone the different MCoTI-I-intein linear precursors into the pTXB1 expression plasmid.


Cyclotide 

name 

Oligonucleotide  sequence 

wt 

p5 

5' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

V1A

p5 

5' -TATGtgcggttctggttctgacggtggtgcttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaagcaccaccgtcagaaccagaaccgcaCA-3' 

V1S

p5 

5 ' -TATGtgcggttctggttctgacggtggttcttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaagaaccaccgtcagaaccagaaccgcaCA-3' 

P3A 

p5 

5' -TATGtgcggttctggttctgacggtggtgtttgcgctaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttagcgcaaacaccaccgtcagaaccagaaccgcaCA-3' 

K4A 

p5 

5'  -TATGtgcggttctggttctgacggtggtgtttgcccggctatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggatagccgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

I5T 

p5 

5 ' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaaccctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcagggttttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

L6A 

p5 

5 ' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcgctcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgagcgattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

Q7G 

p5 

5 ' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgggtcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgacccaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

Q7M 

p5 

5' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgatgcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgcatcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

R8A 

p5 

5 ' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcaggcttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aagcctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

R10A

p5 

5 ' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcg 
ctcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgagcgc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

R10G 

p5 

5' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcg 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgaccgc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

R11V 

p5 

5' -TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtgttgactctgactgcccgggtgcttgcatctgccgtggtaacggttac  -3' 

p3 

5' -GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcaacacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA  -3 ' 

D12A 

p5 

P3 

5'  -TATGt gcggttctggttct gacggt ggtgtt tgcccgaaaatcctgcagcgt tgcc 
gt cgtgcttctgactgcccgggtgctt  gcatctgccgt ggtaacggt tac- 3' 

5'  -  GCAgt  aaccgt  t  accacggcagat  gcaagcacccgggcagt  cagaagcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-  3' 

S13A 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgacgctgactgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5'-GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagcgtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

D14A 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgcttgcccgggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5'-GCAgtaaccgttaccacggcagatgcaagcacccgggcaagcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

P16A 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcgctggtgcttgcatctgccgtggtaacggttac-3' 

p3 

5'-GCAgtaaccgttaccacggcagatgcaagcaccagcgcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

G17A 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccggctgcttgcatctgccgtggtaacggttac-3' 

p3 

5'-GCAgtaaccgttaccacggcagatgcaagcagccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

A18G 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtggttgcatctgccgtggtaacggttac-3' 

p3 

5'-GCAgtaaccgttaccacggcagatgcaaccacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

I20G 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcggttgccgtggtaacggttac-3' 

p3 

5'-GCAgtaaccgttaccacggcaaccgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

R22W 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgctggggtaacggttac-3' 

p3 

5'-GCAgtaaccgttaccccagcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

G23A 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtgctaacggttac-3' 

p3 

5'-GCAgtaaccgttagcacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

N24W 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggttggggttac-3' 

p3 

5'-GCAgtaaccccaaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

G25P 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacccgtac-3' 

p3 

5'-GCAgtacgggttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

Y26W 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttgg-3' 

p3 

5'-GCAccaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 

Y26A 

p5 

5'-TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgcagcgttgcc 
gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggtgct-3' 

p3 

5'-GCAagcaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc 
aacgctgcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA-3' 
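
Each p5/p3 pair listed above should anneal into a duplex with short single-stranded 5' overhangs, and the p5 strand should encode the intended substitution. A minimal Python sketch of that check follows, using the Q7M pair as printed. The overhang lengths (2 nt on p5, 3 nt on p3), the reading frame starting at the ATG, and the offset between codon index and residue number (taken from the numbering in Table S2, with the insert starting at Cys 27) are inferred from the sequences and should be read as assumptions; the function names are hypothetical.

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("acgt", "tgca")

# Q7M oligonucleotides, copied from the list above (written 5' to 3').
Q7M_P5 = ("TATGtgcggttctggttctgacggtggtgtttgcccgaaaatcctgatgcgttgcc"
          "gtcgtgactctgactgcccgggtgcttgcatctgccgtggtaacggttac")
Q7M_P3 = ("GCAgtaaccgttaccacggcagatgcaagcacccgggcagtcagagtcacgacggc"
          "aacgcatcaggattttcgggcaaacaccaccgtcagaaccagaaccgcaCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, in lower case."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def anneals(p5, p3):
    """True if the strands pair everywhere except the assumed 5' overhangs
    (2 nt on p5, 3 nt on p3)."""
    return reverse_complement(p5[2:]) == p3[3:].lower()


def translate(p5):
    """Translate the p5 strand, skipping the leading T and the ATG codon."""
    coding = p5[4:].lower()
    return "".join(CODON_TABLE[coding[i:i + 3]]
                   for i in range(0, len(coding) - 2, 3))


peptide = translate(Q7M_P5)
# Assumed numbering: codon 0 is Cys 27, codons 1-7 are Gly 28 to Gly 34,
# so residue n (1 to 26) sits at index n + 7.
residue_7 = peptide[7 + 7]
print(anneals(Q7M_P5, Q7M_P3), peptide, residue_7)
```

The same two functions can be applied to any other pair in the list; a strand that fails `anneals` or shows an unexpected residue would point to a transcription error in the printed sequence.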

Table S2. Summary of the ¹H and ¹⁵Nα NMR assignments for the main-chain protons (i.e. Nα-H and Cα-H) of recombinant MCoTI-I. 

Residue   ¹H-¹⁵Nα (ppm)   ¹⁵Nα (ppm)   ¹H-Cα (ppm)
Val 1     8.33            120.96       3.89
Cys 2     8.54            126.39       4.92
Pro 3     -               -            -
Lys 4     8.10            120.61       4.17
Ile 5     7.523           119.968      4.25
Leu 6     8.53            125.78       4.35
Gln 7     -               -            -
Arg 8     8.61            127.62       4.37
Cys 9     8.34            120.34       4.49
Arg 10    7.97            117.20       4.58
Arg 11    9.36            117.83       4.35
Asp 12    9.32            120.98       4.76
Ser 13    8.27            115.75       4.68
Asp 14    7.65            120.57       4.44
Cys 15    7.97            117.74       4.82
Pro 16    -               -            -
Gly 17    8.37            106.63       3.65
Ala 18    8.33            125.14       4.31
Cys 19    8.06            117.00       4.50
Ile 20    8.9             113.42       4.24
Cys 21    9.03            124.20       3.95
Arg 22    8.02            128.58       4.17
Gly 23    8.79            108.27       3.80
Asn 24    7.70            115.77       4.52
Gly 25    8.30            107.30       3.82
Tyr 26    7.19            116.66       5.10
Cys 27    8.68            120.72       5.19
Gly 28    9.72            109.62       4.36, 3.73
Ser 29    8.71            115.78       4.34
Gly 30    9.06            117.73       3.71, 4.22
Ser 31    8.58            116.05       4.27
Asp 32    8.31            122.07       4.49
Gly 33    8.08            108.44       3.67, 3.89
Gly 34    8.05            110.83       3.67, 4.14

Table S3. Summary of the ¹H and ¹⁵Nα NMR assignments for the main-chain protons (i.e. Nα-H and Cα-H) of recombinant MCoTI-I K4A. 

Residue   ¹H-¹⁵Nα (ppm)   ¹⁵Nα (ppm)   ¹H-Cα (ppm)
Val 1     8.32            121.17       3.80
Cys 2     8.68            125.40       4.86
Pro 3     -               -            -
Ala 4     6.70            124.03       4.44
Ile 5     7.57            118.41       4.21
Leu 6     8.40            122.79       4.27
Gln 7     8.36            122.55       4.30
Arg 8     8.51            127.02       4.30
Cys 9     8.36            120.17       -
Arg 10    7.95            117.17       4.58
Arg 11    9.34            117.80       4.27
Asp 12    9.32            120.91       4.81
Ser 13    8.23            115.57       4.67
Asp 14    7.63            120.54       4.40
Cys 15    7.96            117.81       4.77
Pro 16    -               -            -
Gly 17    8.31            106.30       3.65
Ala 18    8.36            124.53       4.31
Cys 19    7.93            116.09       4.63
Ile 20    8.88            113.42       4.19
Cys 21    9.006           124.15       3.93
Arg 22    8.04            128.53       4.17
Gly 23    8.77            108.24       3.80
Asn 24    7.65            115.71       4.64
Gly 25    8.25            107.23       3.55, 3.89
Tyr 26    7.14            116.71       5.10
Cys 27    8.67            121.05       5.13
Gly 28    9.63            108.99       3.75, 4.30
Ser 29    8.78            115.94       4.37
Gly 30    9.06            111.70       3.72, 4.22
Ser 31    8.55            116.07       4.28
Asp 32    8.29            122.04       4.49
Gly 33    8.01            108.48       3.67
Gly 34    8.04            110.76       4.65
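
The wild-type and K4A assignments in Tables S2 and S3 can be compared residue by residue. The sketch below, in Python, computes a combined ¹H/¹⁵N amide shift difference for three residues whose values are copied from the two tables. The weighting of the ¹⁵N difference (0.14) and the root-mean-square form are a commonly used convention assumed here purely for illustration; neither is taken from this work, and the function name is hypothetical.

```python
import math

# (amide 1H ppm, 15N ppm) copied from Table S2 (wild type) and Table S3 (K4A).
WILD_TYPE = {
    "Ile 5": (7.523, 119.968),
    "Leu 6": (8.53, 125.78),
    "Tyr 26": (7.19, 116.66),
}
K4A = {
    "Ile 5": (7.57, 118.41),
    "Leu 6": (8.40, 122.79),
    "Tyr 26": (7.14, 116.71),
}

N_WEIGHT = 0.14  # assumed scaling of the 15N difference; not from the source


def shift_perturbation(wild_type, mutant, n_weight=N_WEIGHT):
    """Combined amide shift difference in ppm (assumed weighted RMS form)."""
    d_h = mutant[0] - wild_type[0]
    d_n = mutant[1] - wild_type[1]
    return math.sqrt(0.5 * (d_h ** 2 + (n_weight * d_n) ** 2))


perturbations = {
    residue: shift_perturbation(WILD_TYPE[residue], K4A[residue])
    for residue in WILD_TYPE
}
# Residues ordered from the largest to the smallest combined difference.
ranked = sorted(perturbations, key=perturbations.get, reverse=True)

for residue in ranked:
    print(f"{residue}: {perturbations[residue]:.3f} ppm")
```

For these three residues the two positions adjacent to the substitution (Ile 5 and Leu 6) rank above Tyr 26; extending the dictionaries to all assigned residues would give the full profile.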